An image from the Karl G. Jansky Very Large Array of the galaxy Hercules 
A (also known as 3C348) showing powerful synchrotron jets emerging from its 
core, the site of a supermassive black hole of 10⁹ solar masses. The field center 
is RA = 16h 51m 8.147s, Dec. = 4° 59′ 33.32″ (2000), and the field of view is 3.3 
× 2.4 arcmin. The image has been rotated clockwise by 36 degrees. The data set 
comprised 70 hours of observations acquired in 2010 and 2011 in bands from 4.2 
to 9 GHz in all four array configurations with baselines from 36 m to 36 km. The 
image resolution is 0.3”, corresponding to a linear scale of 800 pc at a distance 
of 730 Mpc, and the image contains about 10.7 Mpixels. The dynamic range is 
about 1200. The image has been reconstructed with a multiresolution CLEAN 
algorithm and self-calibration procedures described in Chapter 11. Color coded by 
intensity. Image from the NRAO, courtesy of B. Saxton, W. Cotton, and R. Perley 
(NRAO/AUI/NSF). © NRAO. 

Preface to the Third Edition 


The advances in radio astronomy, especially in instrumentation for interferometry, 
over the past 15 years since the second edition have been remarkable. With the 
commissioning of the Atacama Large Millimeter/submillimeter Array (ALMA), 
high-resolution radio astronomy has reached the high-frequency limit of ground- 
based observations of about 1 THz. There has been a revitalization of interest 
at low frequencies, with multiple new instruments such as the LOw Frequency 
ARray (LOFAR), the Long Wavelength Array (LWA), and the Murchison Widefield 
Array (MWA). Tremendous advances in signal-processing capabilities have enabled 
the first instruments with multiple fields of view, the Australian SKA Pathfinder 
(ASKAP) and APERTIF on the Westerbork array. VLBI has reached submillimeter 
wavelengths and is being used by the Event Horizon Telescope (EHT) to resolve 
the structure of the emission surrounding the black hole in the center of our galaxy. 
VLBI with the elements in Earth orbit, RadioAstron and VSOP, has greatly extended 
the baselines available. 

Much new material has been added to this edition. In Chap. 1, the historical 
perspective has been brought up to date. An appendix has been added where the 
radiometer equation, which gives the fundamental limitation in the sensitivity of a 
radio telescope, has been derived from basic principles. In Chap. 2, a new appendix 
gives an overview of the Fourier transform theory used throughout the book. 
Chapter 4 includes a description of the so-called measurement equation, which 
provides a unified framework for array calibration. Chapter 5 includes a description 
of the new instruments available, including the fast Fourier Transform Telescope. 
The discussion of system design has been substantially expanded in Chap. 7. 
In Chap. 8, which deals with digital signal processing, the coverage of FX-type 
correlators has been greatly expanded and the operation of polyphase filter banks 
explained. The analysis of sensitivity loss due to quantization has been generalized. 
An appendix describing the basic properties of the discrete Fourier transform has 
been added. Chapter 9 on VLBI has been updated to reflect the conversion from 
data storage on tape to data storage on disk media. With the prevalence of direct data 
transmission to correlation facilities, the distinction between VLBI and connected- 
element interferometry continues to diminish. In Chap. 10, the discussion of model 
fitting in the (u, v) plane has been greatly expanded to reflect a trend in the field 
toward fitting the fundamental interferometric data even though image fidelity 
continues to improve dramatically. The phase and amplitude closure conditions are 
explored in greater depth because of their underlying importance in data calibration. 
In Chap. 11, advances in image processing algorithms are described, including the 
application of compressed sensing techniques. Chapter 12 describes the techniques 
underlying the tremendous advance in astrometry. Precisions of 10 microarcseconds 
are now routine as a result of progress in phase-referencing methods. In this edition, 
discussion of the propagation of the neutral atmosphere and the ionized media from 
the ionosphere to the interstellar medium has been separated into two chapters, 
Chaps. 13 and 14, because of the growth in information in these areas. Over the last 
15 years, enormous amounts of data have been acquired on site characterization, 
which are described in Chap. 13. Because of the importance of both two- and three- 
dimensional turbulence in the troposphere, a detailed analysis of the two regimes is 
given. Chapter 17, on related techniques, includes new material on the use of radio 
arrays to track satellites and space debris. It also describes the application of radio 
interferometry to remote sensing of the Earth. Such application provides important 
information on soil moisture and ocean salinity. 

In the early days of radio interferometry, measurements of the distribution of 
source intensity were usually referred to as “maps” and the associated technique 
as “mapping.” With the maturity of the field, it seems more appropriate to refer to 
the results as “images.” We have done so, except in a few cases where the term 
“map” still seems appropriate, as in the determination of the distribution of maser 
spot positions from fringe rate measurements. 

Readers who are new to the field of radio astronomy are strongly encouraged to 
study the basic principles of the field from other sources. Some of the numerous 
textbooks are listed under Further Reading at the end of Chap. 1. Of particular 
usefulness is the book The Fourier Transform and Its Applications by Ron Bracewell, 
a radio astronomer and mathematician, because of its practical approach to the 
subject. The intellectual roots of this approach can be traced to the lecture notes of 
J. A. Ratcliffe of Cambridge University, which inspired the book Fourier Transforms 
and Convolutions for the Experimentalist by Roger Jennison. 

The authors would be grateful for any feedback from the readers of this book in 
regard to pedagogical, technical, or grammatical issues or typographical errors. 

We have benefited greatly from many of our colleagues who have helped in the 
preparation of this edition. They include Betsey Adams, Kazunori Akiyama, Subra 
Ananthakrishnan, Yoshiharu Asaki, Jaap Baars, Denis Barkats, Norbert Bartel, Leo 
Benkevitch, Mark Birkinshaw, Katie Bouman, Geoff Bower, Michael Bremer, John 
Bunton, Andrew Chael, Barry Clark, Tim Cornwell, Pierre Cox, Adam Deller, 
Hélène Dickel, Phil Edwards, Ron Ekers, Pedro Elosegui, Phil Erickson, Hugh 
Garsden, John Gibson, Lincoln Greenhill, Richard Hills, Mareki Honma, Chat 
Hull, Michael Johnson, Ken Kellermann, Eric Keto, Robert Kimberk, Jonathon 
Kocz, Vladimir Kostenko, Yuri Kovalev, Laurent Loinard, Colin Lonsdale, Ryan 
Loomis, Chopo Ma, Dick Manchester, Satoki Matsushita, John McKean, Russ 
McWhirter, Arnaud Mialon, George Miley, Eric Murphy, Tara Murphy, Ramesh 
Narayan, Scott Paine, Nimesh Patel, Michael Pearlman, Richard Plambeck, Danny 
Price, Rurik Primiani, Simon Radford, Mark Reid, Maria Rioja, Luis Rodriguez, 
Nemesio Rodríguez-Fernández, Alan Rogers, Jon Romney, Katherine Rosenfeld, 
Jean Rüeger, Marion Schmitz, Fred Schwab, Mamoru Sekido, T. K. Sridharan, 
Anjali Tripathi, Harish Vedantham, Jonathan Weintroub, Alan Whitney, David 
Wilner, Robert Wilson, and Andre Young. 

JM taught a graduate course in radio astronomy at Harvard University biannually 
for 40 years. He thanks the hundreds of students who took this course for the 
feedback, stimulation, and challenges they posed. 

The publication of this edition under an Open Access license was made possible 
by grants from the D. H. Menzel Fund at Harvard University and the National Radio 
Astronomy Observatory. We are particularly grateful to Charles Alcock, director of 
the Harvard–Smithsonian Center for Astrophysics, and Anthony Beasley, director of 
the National Radio Astronomy Observatory, for their generous support of all aspects 
of this project. 

We thank John Lewis for much help with the graphics and other creative 
contributions that improved the presentation of material in this book. We are 
also grateful to Tania Burchell, Maureen Connors, Christopher Erdmann, Muriel 
Hodges, Carolyn Hunsinger, Clinton Leite, Robert Reifsnyder, and Larry Selter for 
their valuable support. 

The publication of this edition would not have been possible without the tireless 
and expert assistance of Carolann Barrett of Harvard University. An experienced 
editor with a degree in mathematics, she completed both our sentences and our 
equations. Her capacity to hold every detail of the book in her brain is truly amazing. 


Charlottesville, VA, USA A. Richard Thompson 
Cambridge, MA, USA James M. Moran 
Urbana, IL, USA George W. Swenson Jr. 


June 2016 


Preface to the Second Edition 


Half a century of remarkable scientific progress has resulted from the application of 
radio interferometry to astronomy. Advances since 1986, when this book was first 
published, have resulted in the VLBA (Very Long Baseline Array), the first array 
fully dedicated to very-long-baseline interferometry (VLBI), the globalization of 
VLBI networks with the inclusion of antennas in orbit, the increasing importance 
of spectral line observations, and the improved instrumental performance at both 
ends of the radio spectrum. At the highest frequencies, millimeter-wavelength 
arrays of the Berkeley–Illinois–Maryland Association (BIMA), the Institut de Radio 
Astronomie Millimétrique (IRAM), the Nobeyama Radio Observatory (NRO), and 
the Owens Valley Radio Observatory (OVRO), which were in their infancy in 1986, 
have been greatly expanded in their capabilities. The Submillimeter Array (SMA) 
and the Atacama Large Millimeter/submillimeter Array (ALMA), a major interna- 
tional project at millimeter and submillimeter wavelengths, are under development. 
At low frequencies, with their special problems involving the ionosphere and wide- 
field mapping, the frequency coverage of the Very Large Array (VLA) has been 
extended down to 75 MHz, and the Giant Metrewave Radio Telescope (GMRT), 
operating down to 38 MHz, has been commissioned. The Australia Telescope and 
the expanded Multi-Element Radio Linked Interferometer Network (MERLIN) have 
provided increased capability at centimeter wavelengths. 

Such progress has led to this revised edition, the intent of which is not only 
to bring the material up to date but also to expand its scope and improve its 
comprehensibility and general usefulness. In a few cases, symbols used in the first 
edition have been changed to follow the general usage that is becoming established 
in radio astronomy. Every chapter contains new material, and there are new figures 
and many new references. Material in the original Chap. 3 that was peripheral to the 
basic discussion has been condensed and moved to a later chapter. Chapter 3 now 
contains the essential analysis of the response of an interferometer. The section on 
polarization in Chap. 4 has been substantially expanded, and a brief introduction to 
antenna theory has been added to Chap. 5. Chapter 6 contains a discussion of the 
sensitivity for a wide variety of instrumental configurations. A discussion of spectral 
line observations is included in Chap. 10. Chapter 13 has been expanded to include 
a description of the new techniques for atmospheric phase correction, and site- 
testing data and techniques at millimeter wavelengths. Chapter 14 has been added 
and contains an examination of the van Cittert–Zernike theorem and discussions of 
spatial coherence and scattering, some of which is derived from the original Chap. 3. 

Special thanks are due to a number of people for reviews or other help during the 
course of the revision. These include D. C. Backer, J. W. Benson, M. Birkinshaw, 
G. A. Blake, R. N. Bracewell, B. F. Burke, B. Butler, C. L. Carilli, B. G. Clark, 
J. M. Cordes, T. J. Cornwell, L. R. D’Addario, T. M. Dame, J. Davis, J. L. Davis, 
D. T. Emerson, R. P. Escoffier, E. B. Fomalont, L. J. Greenhill, M. A. Gurwell, C. R. 
Gwinn, K. I. Kellermann, A. R. Kerr, E. R. Keto, S. R. Kulkarni, S. Matsushita, D. 
Morris, R. Narayan, S.-K. Pan, S. J. E. Radford, R. Rao, M. J. Reid, A. Richichi, 
A. E. E. Rogers, J. E. Salah, F. R. Schwab, S. R. Spangler, E. C. Sutton, B. E. Turner, 
R. F. C. Vessot, W. J. Welch, M. C. Wiedner, and J.-H. Zhao. For major contributions 
to the preparation of the text and diagrams, we thank J. Heidenreich, G. L. Kessler, 
P. Smiley, S. Watkins, and P. Winn. For extensive help in preparation and editing, 
we are especially indebted to P. L. Simmons. We are grateful to P. A. Vanden Bout, 
director of the National Radio Astronomy Observatory, and to I. I. Shapiro, director 
of the Harvard–Smithsonian Center for Astrophysics, for the encouragement and 
support. The National Radio Astronomy Observatory is operated by Associated 
Universities Inc. under contract with the National Science Foundation, and the 
Harvard–Smithsonian Center for Astrophysics is operated by Harvard University 
and the Smithsonian Institution. 


Charlottesville, VA, USA A. Richard Thompson 
Cambridge, MA, USA James M. Moran 
Urbana, IL, USA George W. Swenson Jr. 


Preface to the First Edition 


The techniques of radio interferometry as applied to astronomy and astrometry 
have developed enormously in the past four decades, and the attainable angular 
resolution has advanced from degrees to milliarcseconds, a range of more than six 
orders of magnitude. As arrays for synthesis mapping¹ have developed, techniques 
in the radio domain have overtaken those in optics in providing the finest angular 
detail in astronomical images. The same general developments have introduced 
new capabilities in astrometry and in the measurement of the Earth’s polar and 
crustal motions. The theories and techniques that underlie these advances continue 
to evolve but have reached by now a sufficient state of maturity that it is appropriate 
to offer a detailed exposition. 

The book is intended primarily for graduate students and professionals in 
astronomy, electrical engineering, physics, or related fields who wish to use inter- 
ferometric or synthesis-mapping techniques in astronomy, astrometry, or geodesy. 
It is also written with radio systems engineers in mind and includes discussions of 
important parameters and tolerances for the types of instruments involved. Our aim 
is to explain the underlying principles of the relevant interferometric techniques but 
to limit the discussion of details of implementation. Such details of the hardware and 
the software are largely specific to particular instruments and are subject to change 
with developments in electronic engineering and computing techniques. With an 
understanding of the principles involved, the reader should be able to comprehend 
the instructions and instrumental details that are encountered in the user-oriented 
literature of most observatories. 

The book does not stem from any course of lectures, but the material included 
is suitable for a graduate-level course. A teacher with experience in the techniques 
described should be able to interject easily any necessary guidance to emphasize 
astronomy, engineering, or other aspects as required. 


¹We define synthesis mapping as the reconstruction of images from measurements of the Fourier 
transforms of their brightness distributions. In this book, the terms map, image, and brightness 
(intensity) distribution are largely interchangeable. 

The first two chapters contain a brief review of radio astronomy basics, a short 
history of the development of radio interferometry, and a basic discussion of the 
operation of an interferometer. Chapter 3 discusses the underlying relationships 
of interferometry from the viewpoint of the theory of partial coherence and may 
be omitted from a first reading. Chapter 4 introduces coordinate systems and 
parameters that are required to describe synthesis mapping. It is appropriate then 
to examine configurations of antennas for multielement synthesis arrays in Chap. 5. 
Chapters 6–8 deal with various aspects of the design and response of receiving 
systems, including the effects of quantization in digital correlators. The special 
requirements of very-long-baseline interferometry (VLBI) are discussed in Chap. 9. 
The foregoing material covers in detail the measurement of complex visibility 
and leads to the derivation of radio maps discussed in Chaps. 10 and 11. The 
former presents the basic Fourier transformation method and the latter the more 
powerful algorithms that incorporate both calibration and transformation. Precision 
observations in astrometry and geodesy are the subject of Chap. 12. There follow 
discussions of factors that can degrade the overall performance, namely, effects 
of propagation in the atmosphere, the interplanetary medium, and the interstellar 
medium in Chap. 13 and radio interference in Chap. 14. Propagation effects 
are discussed at some length since they involve a wide range of complicated 
phenomena that place fundamental limits on the measurement accuracy. The final 
chapter describes related techniques including intensity interferometry, speckle 
interferometry, and lunar occultation observations. 

References are included to seminal papers and to many other publications and 
reviews that are relevant to the topics of the book. Numerous descriptions of 
instruments and observations are also referenced for purposes of illustration. Details 
of early procedures are given wherever they are of help in elucidating the principles 
or origin of current techniques, or because they are of interest in their own right. 
Because of the diversity of the phenomena described, it has been necessary, in some 
cases, to use the same mathematical symbol for different quantities. A glossary of 
principal symbols and usage follows the final chapter. 

The material in this book comes only in part from the published literature, and 
much of it has been accumulated over many years from discussions, seminars, 
and the unpublished reports and memoranda of various observatories. Thus, we 
acknowledge our debt to colleagues too numerous to mention individually. Our 
special thanks are due to a number of people for critical reviews of portions of 
the book or for other support. These include D. C. Backer, D. S. Bagri, R. H. T. 
Bates, M. Birkinshaw, R. N. Bracewell, B. G. Clark, J. M. Cordes, T. J. Cornwell, 
L. R. D’Addario, J. L. Davis, R. D. Ekers, J. V. Evans, M. Faucherre, S. J. Franke, 
J. Granlund, L. J. Greenhill, C. R. Gwinn, T. A. Herring, R. J. Hill, W. A. Jeffrey, 
K. I. Kellermann, J. A. Klobuchar, R. S. Lawrence, J. M. Marcaide, N. C. Mathur, 
L. A. Molnar, P. C. Myers, P. J. Napier, P. Nisenson, H. V. Poor, M. J. Reid, J. T. 
Roberts, L. F. Rodriguez, A. E. E. Rogers, A. H. Rots, J. E. Salah, F. R. Schwab, 
I. I. Shapiro, R. A. Sramek, R. Stachnik, J. L. Turner, R. F. C. Vessot, N. Wax, and 
W. J. Welch. The reproduction of diagrams from other publications is acknowledged 
in the captions, and we thank the authors and the publishers concerned for the 
permission to use this material. For major contributions to the preparation of the 
manuscript, we wish to thank C. C. Barrett, C. F. Burgess, N. J. Diamond, J. M. 
Gillberg, J. G. Hamwey, E. L. Haynes, G. L. Kessler, K. I. Maldonis, A. Patrick, 
V. J. Peterson, S. K. Rosenthal, A. W. Shepherd, J. F. Singarella, M. B. Weems, 
and C. H. Williams. We are grateful to M. S. Roberts and P. A. Vanden Bout, 
former director and present director of the National Radio Astronomy Observatory, 
and to G. B. Field and I. I. Shapiro, former director and present director of the 
Harvard–Smithsonian Center for Astrophysics, for the encouragement and support. 
Much of the contribution by J. M. Moran was written while on sabbatical leave 
at the Radio Astronomy Laboratory of the University of California, Berkeley, 
and he is grateful to W. J. Welch for the hospitality during that period. G. W. 
Swenson Jr. thanks the Guggenheim Foundation for a fellowship during 1984-1985. 
Finally, we acknowledge the support of our home institutions: the National Radio 
Astronomy Observatory, which is operated by Associated Universities Inc. under 
contract with the National Science Foundation; the Harvard–Smithsonian Center 
for Astrophysics, which is operated by Harvard University and the Smithsonian 
Institution; and the University of Illinois. 


Charlottesville, VA, USA A. Richard Thompson 
Cambridge, MA, USA James M. Moran 
Urbana, IL, USA George W. Swenson Jr. 


3CR Revised Cambridge Catalog of Radio Sources 
AGN Active galactic nuclei 
AIPS Astronomical Image Processing System 
ALC Automatic level control 
AM Atmospheric model (atmospheric modeling code) 
ASKAP Australian Square Kilometre Array Pathfinder 
ATM Atmospheric transmission of microwaves (atmospheric modeling code) 
CARMA Combined Array for Research in Millimeter-Wave Astronomy 
CBI Cosmic Background Imager 
CCIR International Radio Consultative Committee 
CLEAN Imaging algorithm for removal of unwanted responses due to point spread function 
CMB Cosmic microwave background 
COESA Committee on the Extension of the Standard Atmosphere 
CS Compressed sensing 
CSO Caltech Submillimeter Observatory 
CVN Chinese VLBI Network 
CW Continuous wave 
Direction-dependent 
DFT Discrete Fourier transform 
DM Dispersion measure 
DSB Double sideband 
EHT Event Horizon Telescope 
FFT Fast Fourier transform 
FX Fourier transform before cross multiplication of data 
GMSK Gaussian-filtered minimum shift keying 
LASSO Least absolute shrinkage and selection operator 
LO Local oscillator 
LOD Length of day 
LOFAR LOw Frequency ARray 
Least-mean-square fit 
LWA Long Wavelength Array 
MCMC Markov chain Monte Carlo 
MeerKAT Meer and Karoo Array Telescope 
MEM Maximum entropy method 
MERLIN Multi-Element Radio Linked Interferometer Network 
MERRA Modern-Era Retrospective Analysis for Research and Applications (NASA program) 
MIT Massachusetts Institute of Technology 
NASA National Aeronautics and Space Administration 
NVSS NRAO VLA Sky Survey 
PIM Parametrized Ionosphere Model 
QPSK Quadri-phase-shift keying 
RAM Random access memory 
RFI Radio frequency interference 
RMS Root mean square 
SAO Smithsonian Astrophysical Observatory 
SEFD System equivalent flux density 
SI System International (modern MKS units) 
SIM Space Interferometry Mission 
SIS Superconductor–insulator–superconductor 
SKA Square Kilometre Array 
SMA Submillimeter Array 
SMOS Soil Moisture and Ocean Salinity mission 
SNR Signal-to-noise ratio 
SSB Single sideband 
STI Satellite tracking interferometer 
TDRSS Tracking and Data Relay Satellite System 
TEC Total electron content 
TID Traveling ionospheric disturbance 
TV Total variation 
USNO United States Naval Observatory 
USSR Union of Soviet Socialist Republics 
UT Universal time 
Modified UT 
UTC Coordinated universal time 
UTR-2 Ukrainian Academy of Sciences T-shaped array 
VCR Video cassette recorder 
VERA VLBI Exploration of Radio Astrometry (Japanese-led project) 
VLA Very Large Array 
VLBA Very Long Baseline Array 
VLBI Very-long-baseline interferometry 
VLSI Very-large-scale integrated (circuits) 
VSA Very Small Array 
VSOP VLBI Space Observatory Programme 
WIDAR Wideband Interferometric Digital ARchitecture 
WMAP Wilkinson Microwave Anisotropy Probe 
WVR Water vapor radiometer 
XF Cross-correlation before Fourier transformation 
Y factor Ratio of receiver power outputs with hot and cold input loads 


Principal Symbols 

Listed below are the principal symbols used throughout the book. Locally defined 
symbols with restricted usage are selectively included. 

a Model dimension, scale size, atmospheric model constant 
(Sect. 13.1), scale size of ionospheric irregularities 
(Sect. 14.2) 

Antenna collecting area (reception pattern) 
Antenna polarization matrix (Chap. 4) 
One-dimensional reception pattern 

Antenna collecting area on axis 

Normalized reception pattern 

Mirror-image reception pattern, azimuth 

Galactic latitude (Sect. 14.4) 

Synthesized beam pattern, point-source response 
Normalized synthesized beam pattern 

Magnetic field magnitude 

Magnetic field vector 

Velocity of light 

Constant (Chap. 1), coherence function (Chap. 9), 
convolving function (Chap. 10) 

Turbulence strength parameters for refractive index 
(Chap. 13) 

Turbulence strength, electron density (Chap. 14) 
Amplitude of a complex signal (Appendix 3.1) 
Distance, antenna diameter, baseline declination, projected 
baseline (Chap. 13) 

Fried length (Chaps. 13, 17) 

Inner scale of turbulence 
Outer scale of turbulence 

Diffractive limit 

Distance between ray paths to target and calibrator sources 
in turbulent region 

Distance over which rms phase deviation = 1 rad (Chap. 13) 
Transition from 2-D to 3-D turbulence 

Baseline (antenna spacing), polarization leakage (Chap. 4) 
Baseline vector 

Baseline measured in wavelengths 

Interaxis distance of antenna mount (Chap. 4) 

Equatorial component of baseline 

Dispersion measure (Chap. 13) 

Structure function of refractive index (spatial) (Chap. 13) 
Delay resolution function [Eq. (9.181)] 

Structure function of phase (temporal) (Chap. 13) 
Structure function of phase (spatial) (Chaps. 12, 13) 
Dispersion in optical fiber (Sect. 7.1, Appendix 7.2) 
Magnitude of electronic charge (Chap. 14), emissivity 
Electric field (usually in the measurement plane), spectral 
components of electric field, energy 

Components of electric field 

Electric field at a source or aperture (Chaps. 3, 15, 17), 
elevation angle 

Frequency of Fourier components of power spectrum 
(Chaps. 9, 13) 

f_i Oscillator strength at resonance i (Chap. 13) 

Phase switching waveforms (Chap. 7) 

F Power flux density (W m⁻²), fringe function 

F_h Threshold of harmful interference (W m⁻²) (Chap. 16) 

F(β) Faraday dispersion function (Chap. 13) 

See Eq. (9.17) 

F_1, F_2, F_3 Entropy measures (Chap. 11) 

F_B Bandwidth pattern (Chap. 2) 

Sensitivity degradation factor (Chap. 7) 

F_R, F_I Quantized fringe-rotation functions (Chap. 9) 

Voltage gain constant for an antenna, gravitational 
acceleration (Chap. 13) 

Gravitational constant 

Power gain of receiver for one antenna (Chap. 7) 

Gain factor for a correlated antenna pair 



G_0 Gain factor (Chap. 7) 

G Occultation response function (Chap. 17) 

h Planck’s constant, impulse response of a filter (Sect. 3.3), 
hour angle of baseline, height, height above surface 

h_0 Atmospheric scale height (Chap. 13) 

H Hour angle, voltage–frequency response, Hadamard matrix 
(Sect. 7.5) 

H_0 Gain constant 

i Electric current 

i Unit vector in direction of polar or azimuth axes (Chap. 4), 
current vector (Chap. 14) 

I Intensity, Stokes parameter 

I² Variance of fractional frequency deviation (Chap. 9) 

I_s Speckle intensity (Chap. 17) 

I_v Stokes visibility 

I_0 Peak intensity of a point source, derived (synthesized) 
intensity distribution, modified Bessel function of zero order 
(Chaps. 6, 9) 

I_1 One-dimensional intensity function, modified Bessel 
function of first order (Chap. 9) 

Im Imaginary part 

j √−1 

J Jones matrix (Chap. 4) 

j_ν Volume emissivity of a source (Chap. 13) 

J Mutual intensity (Chap. 15) 

J_0 Bessel function of first kind and zero order 

J_1 Bessel function of first kind and first order 

k Boltzmann’s constant, propagation constant 
2π/λ (Chap. 13) 

k Propagation vector with magnitude 2π/λ (Chap. 9) 

l Direction cosine with respect to baseline component u, lapse 
rate (Chap. 13) 

L Length of a transmission line, loss factor in a transmission 
line (Chap. 7), probability integral [Eq. (8.109)], path length, 
likelihood function (Chap. 12), thickness of turbulent 
atmospheric layer or screen (Chap. 13) 

L_in, L_out Scales of turbulence (Chap. 13) 

ℓ Multipole moment (Chap. 10), length, galactic longitude 
(Chap. 13) 

L Unit spacing (in wavelengths) in a grating array (Chaps. 1, 5) 
ℒ Latitude, excess path length (Chap. 13) 

ℒ_D, ℒ_V Excess path length of dry air, water vapor 

m Direction cosine with respect to baseline component v, 
modulation index (Appendix 7.2), measured quantity 
(Appendix 12.1), electron mass (Chap. 13) 

m_ℓ, m_c, m_t Degree of linear, circular, and total polarization 

M Frequency multiplication factor (Chap. 9), model function 
(Chap. 10), mass, complex degree of linear polarization 
(Chap. 13) 

M, M_D, M_V Molecular weight; total, dry air, water vapor (Chap. 13) 

n Direction cosine with respect to baseline component w, 
weighting factor in quantization (Chap. 8), noise component, 
index of refraction (Chap. 13) 

n = n_R + jn_I Complex refractive index 

n_a Number of antennas 

n_d Number of data points 

n_e, n_i, n_n, n_m Density of electrons, ions, neutral particles, and molecules 
(Chap. 13) 

n_p Number of antenna pairs 

n_s Number of sources 

n_r Number of points in a rectangular array (grid points) 

n_0 Refractive index at Earth’s surface (Chap. 13) 

N Number of samples (Chap. 8), total refractivity (Chap. 13) 

N_b Number of bits per sample (Chap. 8) 

N_D, N_V Refractivity of dry air, water vapor (Chap. 13) 

N_N Number of Nyquist rate samples (Chap. 8) 

N 2N and (2N + 1) are even and odd numbers of quantization 
levels (Chap. 8) 

p Probability density or probability distribution [i.e., p(x) dx is 
the probability that the random variable lies between x and 
x + dx], bivariate normal probability function (Chap. 8), 
number of model parameters (Chap. 10), partial pressure 
(Sect. 13.1), impact parameter (Sects. 12.6, 14.3) 

p_D Partial pressure of dry air (Chap. 13) 

p_V Partial pressure of water vapor (Chap. 13) 

P Power, cumulative probability, total atmospheric pressure 
(Chap. 13) 

P_0 Atmospheric pressure at Earth’s surface (Chap. 13) 

P Dipole moment per unit volume 

P_3 Triple product (bispectrum) 

P mnp Instrumental polarization factor 
Spectrum of electron density fluctuations 

Point-source response at Moon’s limb (Sect. 17.2), speckle 
point-spread function (Sect. 17.6.4) 

Distance in (u, v) plane 

Distance in (u’, v’) plane 

Components in the spatial frequency (cycles per meter) 
plane (Chap. 13) 

Stokes parameter, quality factor of a line or cavity 

(Sect. 9.5), number of quantization levels (Sects. 8.3, 9.6) 
Stokes visibility 

Correlator output, distance in the (/, m) plane, radial distance 
Position vector of antenna relative to center of Earth 
Classical electron radius (Chap. 14) 

Correlator output resulting from lower sideband 
Pearson’s correlation coefficient 

Correlator output resulting from upper sideband 

Radius of the Earth 

Autocorrelation function, correlator output, robustness 
factor (Sect. 10.2.2.1), frequency ratio (Sect. 12.2.4), 
distance, gas constant (Chap. 13) 

Correlator output matrix (Chap. 4) 

Response with visibility averaging (Chap. 6) 

Response with finite bandwidth (Chap. 6) 

Radius of electron orbit (Chap. 14) 

Far-field distance (Chap. 15) 

Rotation measure (Chap. 14) 

Distance of the Moon’s limb (Chap. 17) 

Autocorrelation for n-level quantization (Chap. 8) 
Autocorrelation function of fractional frequency deviation 
(Chap. 9) 

Distance of Earth to Sun 

Autocorrelation function of phase (Chaps. 9, 13) 

Real part 

Signal-to-noise ratio 

Signal component, smoothness measure (Chap. 11) 

Unit position vector (Chap. 3) 

Unit position vector of field center (Chap. 3) 

(spectral) power flux density (W m⁻² Hz⁻¹)

Flux density of a calibrator 

System equivalent flux density 

Threshold of harmful interference (W m⁻² Hz⁻¹) (Chap. 16)
Square wave functions (Sect. 7.5) (also known as
Rademacher functions) 

Cross power spectrum (Chap. 9) 

Power spectrum of intensity fluctuations (Chap. 14) 
Single-sided and double-sided power spectra of fractional 
frequency deviation (single-sided power spectrum used only 
in Sect. 9.4) 

Single-sided and double-sided power spectra of phase 
fluctuations (single-sided power spectrum used only in 
Sect. 9.4) 

Two-dimensional power spectrum of phase (Chap. 13) 
Time 

Period of the Earth’s rotation (Chap. 12) 

Cycle period for target and calibrator sources 
Temperature, time interval, transmission factor (Chap. 15) 
Atmospheric temperature (Chap. 13) 

Component of antenna temperature resulting from target 
source 

Total antenna temperature 

Brightness temperature 

Noise temperature of calibration signal 

Gas temperature (Chap. 9) 

Receiver temperature 

System temperature 

Time interval 

Antenna spacing coordinate in units of wavelength (spatial 
frequency) 

Projection of u coordinate onto the equatorial plane 
Stokes parameter 

Stokes visibility (Chap. 4) 

Unwanted response (Sect. 7.5) 

Antenna spacing coordinate in units of wavelength (spatial 
frequency), phase velocity in a transmission line (Chap. 8) 
Projection of v coordinate onto the equatorial plane 

Group velocity (Chap. 14) 

Rate of angular motion of Moon’s limb (Chap. 16) 

Phase velocity (Chap. 13) 

Radial velocity 

Velocity of scattering screen (parallel to baseline, if relevant) 
(Chaps. 12, 13) 

Quantization level (Chap. 8), particle velocity (Chap. 9) 
XA


Yk 
ya 


Zp, Zv 


Voltage, Stokes parameter 

Voltage response of an antenna 

Stokes visibility (Chap. 4) 

Complex visibility, vector visibility 

Measured complex visibility 

Michelson’s fringe visibility 

Normalized complex visibility 

Antenna spacing coordinate in units of wavelength (spatial 
frequency), weighting function, column height of 
precipitable water (Chap. 13) 

w coordinate measured in the polar direction 
Atmospheric weighting function (Chap. 13) 

Mean of weighting factors (Chap. 6) 

Root-mean-square of weighting factors (Chap. 6) 
Visibility tapering function (Chap. 10) 

Function that adjusts visibility amplitude for effective 
uniform weighting (Chap. 10) 

Spectral sensitivity function (spatial transfer function); 
propagator (Chap. 15) 

General position coordinate, coordinate in antenna aperture, 
signal voltage 

x coordinate measured in wavelengths 

Coordinate of antenna spacing [see Eq. (4.1)], signal 
waveform measured in units of rms amplitude (Chap. 8), 
coordinate within a source or an aperture (Chaps. 3, 15), 
signal spectrum (Sect. 8.7) 

X coordinate measured in wavelengths 

General position coordinate, coordinate in antenna aperture, 
signal voltage, distance along a ray path (Chap. 13) 
Fractional frequency deviation (Chap. 9) 

y coordinate measured in wavelengths 

Coordinate of antenna spacing [Eq. (4.1)], Y-factor 
(Chap. 7), coordinate within a source or aperture 
(Chaps. 3, 15), signal waveform measured in units of rms 
amplitude (Sect. 8.4), signal spectrum (Sect. 8.7) 

Y coordinate measured in wavelengths 

General position coordinate, signal voltage, zenith angle 
(Chap. 13), redshift 

z coordinate measured in wavelengths 

Coordinate of antenna spacing [Eq. (4.1)], visibility plus 
noise in correlator output (Chaps. 6, 9) 


Compressibility factors for dry air and water vapor 
(Chap. 13) 
Visibility-plus-noise vector (Chaps. 6, 9)

Z coordinate measured in wavelengths

Right ascension, power attenuation coefficient, quantization
threshold in units of σ (Chap. 8), spectral index (Chap. 11),
absorption coefficient and power-law exponent in Table 13.2 
and related text (Sect. 13.1), exponent in electron density 
fluctuation (Sect. 13.4) 

Fractional length change in transmission line (Chap. 7), 
oversampling factor (Chap. 8), exponent of distance in rms 
phase fluctuation [Eq. (13.80a)] (Sects. 12.2, 13.1), exponent 
in solar electron density (Sect. 14.3), Faraday depth 

(Sect. 14.4) 

Instrumental polarization factor (Sect. 4.8), maser relaxation 
rate (Chap. 9), loop gain in CLEAN (Chap. 11), 
post-Newtonian GR parameter (Chap. 12), source coherence 
function (Chap. 15) 

Damping factor (Chap. 13), mutual coherence function 
(Chap. 15), gamma function 

Mutual coherence function (Chap. 15) 

Declination, increment prefix, (Dirac) delta function, 
instrumental polarization factor (Sect. 4.8) 

Delta function in two dimensions 

Small length, increment prefix 

Bandwidth, Doppler shift (Appendix 10.2) 

Intermediate frequency bandwidth 

Low frequency bandwidth 

Frequency difference of local oscillators 

Delay error 

Increments in (u, v) plane 

Increments in (l, m) plane 

Solar elongation (Sect. 12.6) 

Width of quantization level in units of σ (Chap. 8), noise
component in IF signal (Chap. 9), permittivity (Chap. 13) 
Amplitude error (Chap. 11) 

Permittivity of free space (Chap. 13) 

Noise component of correlator output (Chaps. 6, 9), 
residual, error component, dielectric constant (Chap. 13, 
Sect. 17.5), random surface deviation (Chap. 17) 

Noise vector (Chap. 6) 

Loss factor 
$\eta_D$ Discrete delay step-loss factor

$\eta_Q$ Efficiency (loss) factor for Q-level quantization

$\eta_R$ Fringe rotation loss factor

$\eta_S$ Fringe sideband rejection loss factor

θ General angle, angle measured from a plane normal to the
baseline, instrumental phase angle, angle between baseline
and source direction vector (Chap. 12)

$\theta_0$ Angular position of source or field center

$\theta_b$ Width of synthesized beam, bending angle (Chap. 13)

$\theta_f$ Width of synthesized field (field of view)

$\theta_F$ Width of first Fresnel zone

$\theta_{LO}$ Local oscillator phase

$\theta_m$, $\theta_n$ Local oscillator phase at antennas m and n (Chap. 6)

$\theta_s$ Effective beamwidth resulting from atmospheric fluctuations
(Chap. 13), width of source (Chap. 16)

Θ Variation in Earth-rotation angle (UT1−UTC) (Chap. 12)

λ Wavelength

$\lambda_{opt}$ Wavelength of optical carrier (Appendix 7.2)

Λ Reflected amplitude in a transmission line (Chap. 7)

μ Power-law exponent in Allan variance (Chap. 9)

ν Frequency

ν′ Frequency measured with respect to center frequency or
local oscillator frequency (Chap. 9)

$\nu_b$ Bit rate

$\nu_B$ Gyrofrequency (Chap. 13)

Ve Collision frequency (Chap. 13) 

ve Cavity frequency (Chap. 9) 

$\nu_d$ Intermediate frequency at which delay is inserted

$\nu_{ds}$ Delay step frequency (Chap. 9)

$\nu_f$ Fringe frequency

$\nu_{in}$ Instrumental component of fringe frequency (Chap. 12)

$\nu_{IF}$ Intermediate frequency

$\nu_{LO}$ Local oscillator frequency

ve Frequency of a correlator channel (Chap. 9) 

$\nu_m$ Frequency of modulation on optical carrier (Chap. 7)

$\nu_{RF}$ Radio frequency

$\nu_{opt}$ Frequency of optical carrier (Appendix 7.2)

$\nu_p$ Plasma frequency (Chap. 13)

$\nu_0$ Center frequency of an IF or RF band, frequency of
absorption peak (Chap. 13)
Π Parallax angle (Chap. 12)
PD, Pv, PT
Pmn 

Po 

Pm, Pn 

Pw 


Pm 
dv 
PG, Pin 
Ppp 




Autocorrelation function, cross-correlation coefficient, 
reflection coefficient (Chap. 7), gas density (Chap. 13) 
Density: dry air, water vapor, total (Chap. 13) 
Cross-correlation 

Area density in the (u, v) plane (Chap. 10) 

Reflection coefficients in transmission line (Chap. 7) 
Density of water (Chap. 13) 

Standard deviation, rms noise level; radar cross section 
(Chap. 17) 

Position vector on the unit sphere 

Allan standard deviation ($\sigma_y^2$ = Allan variance)
Root-mean-square uncertainty in delay (Chap. 9) 
Root-mean-square deviation of phase 

Time interval 

Averaging (integration) time 

Atmospheric delay error (Chap. 12) 

Coherent integration time (Chap. 9) 

Clock error 

Geometric delay 

Instrumental delay 

Unit increment of instrumental delay, duration of an 
observation (Chap. 6), zenith optical depth (opacity) of the 
atmosphere (Chap. 13) 

Sampling interval in time 

Minimum period of orthogonality (Chap. 7) 

Interval between switch transitions (Chap. 7) 

Optical depth (opacity) (Chap. 13) 

Phase angle 

Phase of signal received by antenna m 

Visibility phase 

Instrumental phase for correlated antenna pair 
Peak-to-peak phase error (Chap. 9) 

Phase of a complex signal (Appendix 3.1), probability 
integral [Eq. (8.44)] (Chap. 8), phase of a signal (Sect. 13.1) 
Arctangent of axial ratio of polarization ellipse 
Chi-squared statistical parameter 

Position angle, phase angle 

Parallactic angle 

Angular rotation velocity of the Earth 

Solid angle 

Solid angle subtended by source 

Solid angle of main lobe of synthesized beam 
Frequently Used Subscripts


> <S nARr* Ses 


Other Symbols 


kk 

( ) 

dot (`) 
double dot (") 
overline (~) 


circumflex (*) 
circumflex ( ) 


Antenna 

Delay, double sideband 

Dry component (Chap. 13) 

Imaginary part 

Intermediate frequency 

Left circular polarization, lower sideband 
Local oscillator 


Center of frequency band or angular field, Earth’s surface
(Chap. 13)

Antenna designation 

Normalized, Nyquist rate (Sects. 8.2, 8.3) 
Right circular polarization 

Real part 

System 

Upper sideband 

Water vapor (Chap. 13) 

Measured in wavelengths 


Unit rectangle function 

Product symbol 

Shah function in one dimension 

Shah function in two dimensions 

“is the Fourier transform of” 

Convolution in one dimension 

Convolution in two dimensions 

Cross correlation in one dimension 

Cross correlation in two dimensions 

Expectation (or approximation by a finite average) 
First derivative with respect to time 

Second derivative with respect to time 

Average (Chaps. 1, 9, Sect. 14.1); Fourier transform of 
function (Chaps. 3, 5, 8, 10, 11, 13, Sect. 14.2) 
Quantized variable (Chap. 8) 

Function of frequency (Chap. 3) 




Angular Notation 


°, ′, ″ Degrees, minutes of arc, and seconds of arc
mas Milliarcseconds

μas Microarcseconds

Functions 


For definitions and descriptions, see, e.g., Abramowitz, M., and Stegun, I.A., 
Handbook of Mathematical Functions, National Bureau of Standards, Washington, 
DC (1964), reprinted by Dover, New York, (1965). 


erf Error function [Eq. (6.63c)]

$J_0$ Bessel function of first kind and zero order [Eq. (A2.55)]
$J_1$ Bessel function of first kind and first order

$I_0$ Modified Bessel function of zero order [Eq. (9.46)]

$I_1$ Modified Bessel function of first order [Eq. (9.52)]

Γ Gamma function [note that Γ(x + 1) = xΓ(x)]

δ Dirac delta function [Eq. (A2.10)]

[| Unit rectangle function [Eq. (A2.12a)] 

[| Modified unit rectangle function [Table 10.2] 

sinc sin πx/(πx) [Eq. (2.4)]


Chapter 1 
Introduction and Historical Review 


The subject of this book can be broadly described as the principles of radio 
interferometry applied to the measurement of natural radio signals from cosmic 
sources. The uses of such measurements lie mainly within the domains of astro- 
physics, astrometry, and geodesy. As an introduction, we consider in this chapter 
the applications of the technique, some basic terms and concepts, and the historical 
development of the instruments and their uses. 

The fundamental concept of this book is that the image, or intensity distribution, 
of a source has a Fourier transform that is the two-point correlation function of the 
electric field, whose components can be directly measured by an interferometer. 
This Fourier transform is normally called the fringe visibility function, which in 
general is a complex quantity. The basic formulation of this principle is called the 
van Cittert-Zernike theorem (see Chap. 15), derived in the 1930s in the context 
of optics but not widely appreciated by radio astronomers until the publication 
of the well-known textbook Principles of Optics by Born and Wolf (1959). The 
techniques of radio interferometry developed from those of the Michelson stellar 
interferometer without specific knowledge of the van Cittert-Zernike theorem.
Many of the principles of interferometry have counterparts in the field of X-ray 
crystallography (see Beevers and Lipson 1985). 


1.1 Applications of Radio Interferometry 


Radio interferometers and synthesis arrays, which are basically ensembles of two- 
element interferometers, are used to make measurements of the fine angular detail 
in the radio emission from the sky. The angular resolution of a single radio 
antenna is insufficient for many astronomical purposes. Practical considerations 
limit the resolution to a few tens of arcseconds. For example, the beamwidth of 
a 100-m-diameter antenna at 7-mm wavelength is approximately 17”. In the optical 
range, the diffraction limit of large telescopes (diameter ~ 8 m) is about 0.015”, but 
the angular resolution achievable from the ground by conventional techniques (i.e., 
without adaptive optics) is limited to about 0.5” by turbulence in the troposphere. 
For progress in astronomy, it is particularly important to measure the positions of 
radio sources with sufficient accuracy to allow identification with objects detected 
in the optical and other parts of the electromagnetic spectrum [see, for example, 
Kellermann (2013)]. It is also very important to be able to measure parameters such 
as intensity, polarization, and frequency spectrum with similar angular resolution in 
both the radio and optical domains. Radio interferometry enables such studies to be 
made. 
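
The two diffraction-limited figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes a half-power beamwidth of about 1.2 λ/D (the factor 1.2 is an assumed, typical value that depends on the aperture illumination) and an optical wavelength of 500 nm, which is not stated in the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265 arcsec per radian


def beamwidth_arcsec(wavelength_m, diameter_m, factor=1.2):
    """Approximate half-power beamwidth, factor * lambda / D, in arcseconds.

    The default factor of 1.2 is an assumption for illustration only.
    """
    return factor * wavelength_m / diameter_m * ARCSEC_PER_RAD


radio = beamwidth_arcsec(7e-3, 100.0)    # 100-m antenna at 7-mm wavelength
optical = beamwidth_arcsec(500e-9, 8.0)  # 8-m telescope at 500 nm (assumed)

print(f"radio:   {radio:.1f} arcsec")
print(f"optical: {optical:.4f} arcsec")
```

Under these assumptions, the results are close to the 17″ and 0.015″ given in the text.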

Precise measurement of the angular positions of stars and other cosmic objects 
is the concern of astrometry. This includes the study of the small changes in 
celestial positions attributable to the parallax introduced by the Earth’s orbital 
motion, as well as those resulting from the intrinsic motions of the objects. Such 
measurements are an essential step in the establishment of the distance scale of 
the Universe. Astrometric measurements have also provided a means to test the 
general theory of relativity and to establish the dynamical parameters of the solar 
system. In making astrometric measurements, it is essential to establish a reference 
frame for celestial positions. A frame based on extremely distant high-mass objects 
as position references is close to ideal. Radio measurements of distant, compact, 
extragalactic sources presently offer the best prospects for the establishment of such 
a system. Radio techniques provide an accuracy of the order of 100 μas or less
for absolute positions and 10 μas or less for the relative positions of objects closely
spaced in angle. Optical measurements of stellar images, as seen through the Earth’s
atmosphere, allow the positions to be determined with a precision of about 50 mas.
However, positions of $10^5$ stars have been measured to an accuracy of ~1 mas with
the Hipparcos satellite (Perryman et al. 1997). The Gaia¹ mission is expected to
provide the positions of $10^9$ stars to an accuracy of ~10 μas (de Bruijne et al. 2014).
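
As a rough illustration of what these angular accuracies mean for the distance scale, the sketch below applies the standard small-angle relation for annual parallax, d (pc) = 1/p (arcsec). Treating each quoted accuracy as the smallest parallax that could be registered is an assumption made here for illustration. A useful distance measurement needs a parallax several times larger than its uncertainty.

```python
def distance_pc(parallax_arcsec):
    """Distance in parsecs for an annual parallax given in arcseconds."""
    return 1.0 / parallax_arcsec


# Accuracies quoted in the text, expressed in arcseconds.
accuracies = {
    "ground-based optical (50 mas)": 50e-3,
    "Hipparcos (1 mas)": 1e-3,
    "radio relative positions, Gaia (10 uas)": 10e-6,
}

for label, accuracy in accuracies.items():
    reach = distance_pc(accuracy)
    print(f"{label}: parallax equals the accuracy at {reach:,.0f} pc")
```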

As part of the measurement process, astrometric observations include a deter- 
mination of the orientation of the instrument relative to the celestial reference 
frame. Ground-based observations therefore provide a measure of the variation of 
the orientation parameters for the Earth. In addition to the well-known precession 
and nutation of the direction of the axis of rotation, there are irregular shifts of 
the Earth’s axis relative to the surface. These shifts, referred to as polar motion, 
are attributed to the gravitational effects of the Sun and Moon on the equatorial 
bulge of the Earth and to dynamic effects in the Earth’s mantle, crust, oceans, and 
atmosphere. The same causes give rise to changes in the angular rotation velocity 
of the Earth, which are manifest as corrections that must be applied to the system 
of universal time. Measurements of the orientation parameters are important in the 
study of the dynamics of the Earth. During the 1970s, it became clear that radio 
techniques could provide an accurate measure of these effects, and in the late 1970s,
the first radio programs devoted to the monitoring of universal time and polar motion
were set up jointly by the U.S. Naval Observatory and the U.S. Naval Research
Laboratory, and also by NASA and the National Geodetic Survey. Polar motion
can also be studied with satellites, in particular the Global Positioning System, but
distant radio sources provide the best standard for measurement of Earth rotation.

¹ Gaia is an astrometric space observatory of the European Space Agency.

In addition to revealing angular changes in the motion and orientation of the 
Earth, precise interferometer measurements entail an astronomical determination of 
the vector spacing between the antennas, which for spacings of ~ 100 km or more 
is usually more precise than can be obtained by conventional surveying techniques. 
Very-long-baseline interferometry (VLBI) involves antenna spacings of hundreds 
or thousands of kilometers, and the uncertainty with which these spacings can be 
determined has decreased from a few meters in 1967, when VLBI measurements 
were first made, to a few millimeters. Relative motions of widely spaced sites on 
separate tectonic plates lie in the range 1-10 cm per year and have been tracked 
extensively with VLBI networks. Interferometric techniques have also been applied 
to the tracking of vehicles on the lunar surface and the determination of the positions 
of spacecraft. In this book, however, we limit our concern mainly to measurements 
of natural signals from astronomical objects. 

The attainment of the highest angular resolution in the radio domain of the 
electromagnetic spectrum results in part from the ease with which radio frequency 
(RF) signals can be processed electronically with high precision. The use of the 
heterodyne principle to convert received RF signals to a convenient baseband, by 
mixing them with a signal from a local oscillator, is essential to this technology. 
A block diagram of an idealized standard receiving system (also known as a 
radiometer) is shown in Appendix 1.1. Another advantage in the radio domain is that 
the phase variations induced by the Earth’s neutral atmosphere are less severe than 
at shorter wavelengths. Future technology will provide even higher resolution at 
infrared and optical wavelengths from observatories above the Earth’s atmosphere. 
However, radio waves will remain of vital importance in astronomy since they reveal 
objects that do not radiate in other parts of the spectrum, and they are able to pass 
through galactic dust clouds that obscure the view in the optical range. 


1.2 Basic Terms and Definitions 


This section is written for readers who are unfamiliar with the basics of radio 
astronomy. It presents a brief review of some background information that is useful 
when approaching the subject of radio interferometry. 


1.2.1 Cosmic Signals 


The voltages induced in antennas by radiation from cosmic radio sources are 
generally referred to as signals, although they do not contain information in the 
usual engineering sense. Such signals are generated by natural processes and almost 
universally have the form of Gaussian random noise. That is to say, the voltage as a 
function of time at the terminals of a receiving antenna can be described as a series of 
very short pulses of random occurrence that combine as a waveform with Gaussian 
amplitude distribution. In a bandwidth $\Delta\nu$, the envelope of the radio frequency
waveform has the appearance of random variations with timescale of order $1/\Delta\nu$.
For most radio sources (except, for example, pulsars), the characteristics of the 
signals are invariant with time, at least on the scale of minutes or hours, the duration 
of a typical radio astronomy observation. Gaussian noise of this type is assumed to 
be identical in character to the noise voltages generated in resistors and amplifiers 
and is sometimes called Johnson noise. Such waveforms are usually assumed to be 
stationary and ergodic, that is, ensemble averages and time averages converge to 
equal values. 
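
The statement that many short random pulses combine into a waveform with a Gaussian amplitude distribution is an instance of the central limit theorem. The following sketch is a toy model, not a simulation of a receiver: each voltage sample is taken to be the sum of 30 independent pulses with amplitudes drawn uniformly from [-1, 1]. The pulse count, the uniform amplitude law, and the random seed are all assumptions chosen for illustration.

```python
import random
import statistics


def pulse_sum_samples(n_samples=20000, pulses_per_sample=30, seed=1):
    """Each sample is the sum of many independent random pulse amplitudes."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(pulses_per_sample))
        for _ in range(n_samples)
    ]


def kurtosis(values):
    """Fourth central moment over squared variance (3 for a Gaussian)."""
    mean = statistics.fmean(values)
    var = statistics.fmean([(v - mean) ** 2 for v in values])
    fourth = statistics.fmean([(v - mean) ** 4 for v in values])
    return fourth / var ** 2


samples = pulse_sum_samples()
mean = statistics.fmean(samples)
sigma = statistics.pstdev(samples)
# Fraction of samples within one standard deviation (about 0.683 if Gaussian).
within_1sigma = sum(abs(v - mean) < sigma for v in samples) / len(samples)

print(f"mean = {mean:.3f}, sigma = {sigma:.3f}")
print(f"kurtosis = {kurtosis(samples):.3f}, within 1 sigma = {within_1sigma:.3f}")
```

A kurtosis near 3 and a one-sigma fraction near 0.68 indicate that the summed waveform is close to Gaussian even though the individual pulses are not.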

Most of the power is in the form of continuum radiation, the power spectrum 
of which shows gradual variation with frequency. For some wideband instruments, 
there may be significant variation within the receiver bandwidth. Figure 1.1 shows 
continuum spectra of eight different types of radio sources. Radio emission from the 
radio galaxy Cygnus A, the supernova remnant Cassiopeia A, and the quasar 3C48 
is generated by the synchrotron mechanism [see, e.g., Rybicki and Lightman (1979), 
Longair (1992)], in which high-energy electrons in magnetic fields radiate as a result 
of their orbital motion. The radiating electrons are generally highly relativistic, 
and under these conditions, the radiation emitted by each one is concentrated in 
the direction of its instantaneous motion. An observer therefore sees pulses of 
radiation from those electrons whose orbital motion lies in, or close to, a plane 
containing the observer. The observed polarization of the radiation is mainly linear, 
and any circularly polarized component is generally quite small. The overall linear 
polarization from a source, however, is seldom large, since it is randomized by the 
variation of the direction of the magnetic field within the source and by Faraday 
rotation. The power in the electromagnetic pulses from the electrons is concentrated 
at harmonics of the orbital frequency, and a continuous distribution of electron 
energies results in a continuum radio spectrum. The individual pulses from the 
electrons are too numerous to be separable, and the electric field appears as a 
continuous Gaussian random process with zero mean. The variation of the spectrum 
as a function of frequency is related to the energy distribution of the electrons. 
At low frequencies, these spectra turn over due to the effect of self-absorption. 
M82 is an example of a starburst galaxy. At low frequencies, synchrotron emission 
dominates, but at high frequencies, emission from dust grains at a temperature 
of about 45 K and emissivity of 1.5 dominates. TW Hydrae is a star with a
protoplanetary disk whose emission at radio frequencies is dominated by dust at
a temperature of about 30 K and emissivity of 0.5.
Fig. 1.1 Examples of spectra of eight different types of discrete continuum sources: Cassiopeia A
[supernova remnant, Baars et al. (1977)], Cygnus A [radio galaxy, Baars et al. (1977)],
3C48 [quasar, Kellermann and Pauliny-Toth (1969)], M82 [starburst galaxy, Condon (1992)],
TW Hydrae [protoplanetary disk, Menu et al. (2014)], NGC7027 [planetary nebula, Thompson
(1974)], MWC349A [ionized stellar wind, Harvey et al. (1979)], and Venus [planet, at 9.6”
diameter (opposition), Gurwell et al. (1995)]. For practical purposes, we define the edges of the
radio portion of the electromagnetic spectrum to be set by the limits imposed by ionospheric
reflection at low frequencies (~ 10 MHz) and by atmospheric absorption at high frequencies
(~ 1000 GHz). Some of the data for this figure were taken from NASA/IPAC Extragalactic
Database (2013) [One jansky (Jy) = $10^{-26}$ W m⁻² Hz⁻¹].


NGC7027, the spectrum of which is shown in Fig. 1.1, is a planetary nebula 
within our Galaxy in which the gas is ionized by radiation from a central star. The 
radio emission is a thermal process and results from free-free collisions between 
unbound electrons and ions within the plasma. At the low-frequency end of the 
spectral curve, the nebula is opaque to its own radiation and emits a blackbody 
spectrum. As the frequency increases, the absorptivity, and hence the emissivity, 
decrease approximately as $\nu^{-2}$ [see, e.g., Rybicki and Lightman (1979)], where $\nu$ is
the frequency. This behavior counteracts the $\nu^2$ dependence of the Rayleigh-Jeans
law, and thus the spectrum becomes nearly flat when the nebula is no longer opaque
to the radiation. Radiation of this type is randomly polarized. MWC349A is an
example of an inhomogeneous ionized gas expanding at constant velocity in a stellar
envelope, which gives rise to a spectral dependence of $\nu^{0.6}$.
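
The transition from a blackbody spectrum to a nearly flat one can be sketched numerically. The example assumes a uniform slab of ionized gas, for which the flux density in the Rayleigh-Jeans regime scales as ν²(1 − e^(−τ)), with the optical depth τ falling as ν⁻² as described above. The uniform-slab model and the turnover frequency `nu_turnover` (where τ = 1) are assumptions made for illustration and are not a fit to NGC7027.

```python
import math


def relative_flux(nu, nu_turnover=1.0):
    """Relative flux density of a uniform thermal slab (arbitrary units).

    Assumes S proportional to nu^2 * (1 - exp(-tau)), tau = (nu/nu_turnover)^-2.
    """
    tau = (nu / nu_turnover) ** -2
    return nu ** 2 * -math.expm1(-tau)  # -expm1(-tau) equals 1 - exp(-tau)


def spectral_index(nu, step=1.01):
    """Local slope d(log S)/d(log nu), estimated over a small frequency step."""
    ratio = relative_flux(nu * step) / relative_flux(nu)
    return math.log(ratio) / math.log(step)


for nu in (0.05, 1.0, 50.0):
    print(f"nu/nu_turnover = {nu:5.2f}: spectral index = {spectral_index(nu):.3f}")
```

Well below the turnover the index approaches 2 (opaque, blackbody-like), and well above it the index approaches 0 (transparent, nearly flat).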

At millimeter wavelengths, opaque thermal sources such as planetary bodies 
become very strong and often serve as calibrators. Venus has a brightness temperature
that varies from 700 K (the surface temperature) at low frequencies to 250 K
(the atmospheric temperature) at high frequencies.

In contrast with continuum radiation, spectral line radiation is generated at 
specific frequencies by atomic and molecular processes. A fundamentally important 
line is that of neutral atomic hydrogen at 1420.405 MHz, which results from the 
transition between two energy levels of the atom, the separation of which is related 
to the spin vector of the electron in the magnetic field of the nucleus. The natural 
width of the hydrogen line is negligibly small (~ $10^{-15}$ Hz), but Doppler shifts
caused by thermal motion of the atoms and large-scale motion of gas clouds spread 
the line radiation. The overall Doppler spread within our Galaxy covers several 
hundred kilohertz. Information on galactic structure is obtained by comparison of 
these velocities with those of models incorporating galactic rotation. 
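
To connect line-of-sight velocity with frequency shift, the sketch below uses the non-relativistic Doppler relation Δν = −ν₀ v/c for the hydrogen line. The velocity of 100 km/s is an assumed, representative value for galactic rotation and is not taken from the text.

```python
C_KM_S = 299792.458      # speed of light in km/s
HI_REST_HZ = 1420.405e6  # rest frequency of the neutral hydrogen line


def doppler_shift_hz(velocity_km_s, rest_hz=HI_REST_HZ):
    """Non-relativistic Doppler shift; positive velocity means recession."""
    return -rest_hz * velocity_km_s / C_KM_S


for v in (-100.0, 100.0):  # assumed line-of-sight velocities in km/s
    shift_khz = doppler_shift_hz(v) / 1e3
    print(f"v = {v:+6.1f} km/s -> shift = {shift_khz:+7.1f} kHz")
```

Velocities of order ±100 km/s thus correspond to shifts of several hundred kilohertz, consistent with the overall spread quoted above.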

Our Galaxy and others like it also contain large molecular clouds at temperatures 
of 10-100 K in which new stars are continually forming. These clouds give rise to 
many atomic and molecular transitions in the radio and far-infrared ranges. More 
than 4,500 molecular lines from approximately 180 molecular species have been 
observed [see Herbst and van Dishoeck (2009)]. Lists of atomic and molecular 
lines are given by Jet Propulsion Laboratory (2016), the University of Cologne 
(2016), and Splatalogue (2016). For earlier lists, see Lovas et al. (1979) and Lovas 
(1992). A few of the more important lines are given in Table 1.1. Note that this 
table contains less than 1% of the known lines in the frequency range below 1 THz. 
Figure 1.2 shows the spectrum of radiation of many molecular lines from the Orion 
Nebula in the bands from 214 to 246 and from 328 to 360 GHz. Although the radio 
window in the Earth’s atmosphere ends above ~ 1 THz, sensitive submillimeter-
and millimeter-wavelength arrays can detect such lines as the ²P₃/₂ → ²P₁/₂ line
of CII at 1.90054 THz (158 μm), which are Doppler shifted into the radio window
for redshifts (z) greater than ~ 2. Some of the lines, notably those of OH, H₂O,
SiO, and CH₃OH, show very intense emission from sources of very small apparent
angular diameter. This emission is generated by a maser process [see, e.g., Reid and 
Moran (1988), Elitzur (1992), and Gray (2012)]. 
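
The effect of redshift on the CII line can be sketched with the relation ν_obs = ν_rest/(1 + z). The window edge of 1000 GHz is the approximate limit given in the text. The helper names are introduced here for illustration only.

```python
CII_REST_GHZ = 1900.54    # rest frequency of the CII line quoted in the text
WINDOW_EDGE_GHZ = 1000.0  # approximate upper edge of the radio window (~1 THz)


def observed_ghz(rest_ghz, z):
    """Observed frequency of a line emitted at rest_ghz from redshift z."""
    return rest_ghz / (1.0 + z)


def redshift_for(rest_ghz, observed):
    """Redshift at which a line at rest_ghz is observed at the given frequency."""
    return rest_ghz / observed - 1.0


z_edge = redshift_for(CII_REST_GHZ, WINDOW_EDGE_GHZ)
nu_z2 = observed_ghz(CII_REST_GHZ, 2.0)

print(f"line reaches {WINDOW_EDGE_GHZ:.0f} GHz at z = {z_edge:.2f}")
print(f"observed frequency at z = 2: {nu_z2:.1f} GHz")
```

At z = 2, the line is observed near 634 GHz, well inside the window.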

Table 1.1 Some important radio lines

Chemical name        Chemical formula  Transition                               Frequency (GHz)
Deuterium            D                 ²S₁/₂, F = 3/2 → 1/2                     0.327
Hydrogen             H                 ²S₁/₂, F = 1 → 0                         1.420
Hydroxyl radical     OH                ²Π₃/₂, J = 3/2, F = 1 → 2                1.612ᵃ
Hydroxyl radical     OH                ²Π₃/₂, J = 3/2, F = 1 → 1                1.665ᵃ
Hydroxyl radical     OH                ²Π₃/₂, J = 3/2, F = 2 → 2                1.667ᵃ
Hydroxyl radical     OH                ²Π₃/₂, J = 3/2, F = 2 → 1                1.721ᵃ
Methyladyne          CH                ²Π₁/₂, J = 1/2, F = 1 → 1                3.335
Hydroxyl radical     OH                ²Π₁/₂, J = 1/2, F = 1 → 0                4.766ᵃ
Formaldehyde         H₂CO              1₁₀ – 1₁₁, six F transitions             4.830
Hydroxyl radical     OH                ²Π₃/₂, J = 5/2, F = 3 → 3                6.035ᵃ
Methanol             CH₃OH             5₁ → 6₀ A⁺                               6.668ᵃ
Helium               ³He⁺              ²S₁/₂, F = 1 → 0                         8.665
Methanol             CH₃OH             2₀ → 3₋₁ E                               12.179ᵃ
Formaldehyde         H₂CO              2₁₁ → 2₁₂, four F transitions            14.488
Cyclopropenylidene   C₃H₂              1₁₀ → 1₀₁                                18.343
Water                H₂O               6₁₆ → 5₂₃, five F transitions            22.235ᵃ
Ammonia              NH₃               1, 1 → 1, 1, eighteen F transitions      23.694
Ammonia              NH₃               2, 2 → 2, 2, seven F transitions         23.723
Ammonia              NH₃               3, 3 → 3, 3, seven F transitions         23.870
Methanol             CH₃OH             6₂ → 6₁, E                               25.018
Silicon monoxide     SiO               v = 2, J = 1 → 0                         42.821ᵃ
Silicon monoxide     SiO               v = 1, J = 1 → 0                         43.122ᵃ
Carbon monosulfide   CS                J = 1 → 0                                48.991
Silicon monoxide     SiO               v = 1, J = 2 → 1                         86.243ᵃ
Hydrogen cyanide     HCN               J = 1 → 0, three F transitions           88.632
Formylium            HCO⁺              J = 1 → 0                                89.189
Diazenylium          N₂H⁺              J = 1 → 0, seven F transitions           93.174
Carbon monosulfide   CS                J = 2 → 1                                97.981
Carbon monoxide      ¹²C¹⁸O            J = 1 → 0                                109.782
Carbon monoxide      ¹³C¹⁶O            J = 1 → 0                                110.201
Carbon monoxide      ¹²C¹⁷O            J = 1 → 0, three F transitions           112.359
Carbon monoxide      ¹²C¹⁶O            J = 1 → 0                                115.271
Carbon monosulfide   CS                J = 3 → 2                                146.969
Water                H₂O               3₁₃ → 2₂₀                                183.310ᵃ,ᵇ
Carbon monoxide      ¹²C¹⁶O            J = 2 → 1                                230.538
Carbon monosulfide   CS                J = 5 → 4                                244.936
Water                H₂O               5₁₅ → 4₂₂                                325.153ᵃ,ᵇ
Carbon monosulfide   CS                J = 7 → 6                                342.883
Carbon monoxide      ¹²C¹⁶O            J = 3 → 2                                345.796
Water                H₂O               4₁₄ → 3₂₁                                380.197ᵇ
Carbon monoxide      ¹²C¹⁶O            J = 4 → 3                                461.041
Heavy water          HDO               1₀₁ – 0₀₀                                464.925
Carbon               C                 ³P₁ → ³P₀                                492.162
Water                H₂O               1₁₀ → 1₀₁                                556.936ᵇ
Ammonia              NH₃               1₀ → 0₀                                  572.498
Carbon monoxide      ¹²C¹⁶O            J = 6 → 5                                691.473
Carbon monoxide      ¹²C¹⁶O            J = 7 → 6                                806.652
Carbon               C                 ³P₂ → ³P₁                                809.340

ᵃ Strong maser transition.
ᵇ High atmospheric opacity (see Fig. 13.14).

Fig. 1.3 Elements of solid angle and surface area illustrating the definition of
intensity. dA is normal to s.

The strength of the radio signal received from a discrete source is expressed as
the spectral flux density, or spectral power flux density, and is measured in watts
per square meter per hertz (W m⁻² Hz⁻¹). For brevity, astronomers often refer to
this quantity as flux density. The unit of flux density is the jansky (Jy); 1 Jy = $10^{-26}$
W m⁻² Hz⁻¹. It is used for both spectral line and continuum radiation. The measure
of radiation integrated in frequency over a spectral band has units of W m⁻² and
is referred to as power flux density. In the standard definition of the IEEE (1977),
power flux density is equal to the time average of the Poynting vector of the wave. In
producing an image of a radio source, the desired quantity is the power flux density
emitted per unit solid angle subtended by the radiating surface, which is measured
in units of W m⁻² Hz⁻¹ sr⁻¹. This quantity is variously referred to as the intensity,
specific intensity, or brightness of the radiation. In radio astronomical imaging, we
can measure the intensity in only two dimensions on the surface of the celestial
sphere, and the measured emission is the component normal to that surface, as seen
by the observer.

In radiation theory, the quantity intensity, or specific intensity, often represented 
by I, is the measure of radiated energy flow per unit area, per unit time, per 
unit frequency bandwidth, and per unit solid angle. Thus, in Fig. 1.3, the power 
flowing in direction s within solid angle d2, frequency band dv, and area dA is 
I,(s) d92 dv dA. This can be applied to emission from the surface of a radiating 
object, to propagation through a surface in space, or to reception on the surface of 
a transducer or detector. The last case applies to reception in an antenna, and the 
solid angle then denotes the area of the celestial sphere from which the radiation 
emanates. Note that in optical astronomy, the specific intensity is usually defined as 
the intensity per unit bandwidth ,, where , = I,v’/c, and c is the speed of light 
[see, e.g., Rybicki and Lightman (1979)]. 

For thermal radiation from a blackbody, the intensity is related to the physical
temperature T of the radiating matter by the Planck formula, for which

$$I_\nu = \frac{2kT\nu^2}{c^2} \left[ \frac{h\nu/kT}{e^{h\nu/kT} - 1} \right] , \tag{1.1}$$

where k is Boltzmann’s constant, and h is Planck’s constant. When $h\nu \ll kT$, we
can use the Rayleigh-Jeans approximation, in which case the expression in the
square brackets is replaced by unity. The Rayleigh-Jeans approximation requires
$\nu$ (GHz) $\ll 20\,T$ (K) and is violated at high frequencies and low temperatures
in many situations of interest to radio astronomers. However, for any radiation
mechanism, a brightness temperature $T_B$ can be defined:

$$T_B = \frac{c^2 I_\nu}{2k\nu^2} . \tag{1.2}$$


In the Rayleigh-Jeans domain, the brightness temperature $T_B$ is that of a blackbody
at physical temperature $T = T_B$. In the examples in Fig. 1.1, $T_B$ is of the order of
$10^4$ K for NGC7027 and corresponds to the electron temperature. For Cygnus A and
3C48, $T_B$ is of the order of $10^8$ K or greater and is a measure of the energy density of
the electrons and the magnetic fields, not a physical temperature. As a spectral line
example, $T_B$ for the carbon monoxide (CO) lines from molecular clouds is typically
10-100 K. In this case, $T_B$ is proportional to the excitation temperature associated
with the energy levels of the transition and is related to the temperature and density
of the gas as well as to the temperature of the radiation field.
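
As a rough numerical illustration of Eqs. (1.1) and (1.2), the following Python sketch evaluates the Planck intensity of a blackbody and converts it back to a brightness temperature. The function names and the sample frequency and temperature pairs are illustrative choices, not taken from the text; the physical constants are the SI values.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZ = 1.380649e-23      # Boltzmann constant, J/K
C_LIGHT = 2.99792458e8      # speed of light, m/s

def planck_intensity(nu_hz, temp_k):
    """Blackbody intensity from Eq. (1.1), in W m^-2 Hz^-1 sr^-1."""
    x = H_PLANCK * nu_hz / (K_BOLTZ * temp_k)
    rayleigh_jeans = 2.0 * K_BOLTZ * temp_k * nu_hz**2 / C_LIGHT**2
    return rayleigh_jeans * x / math.expm1(x)   # bracketed factor of Eq. (1.1)

def brightness_temperature(intensity, nu_hz):
    """Brightness temperature from Eq. (1.2), in kelvin."""
    return C_LIGHT**2 * intensity / (2.0 * K_BOLTZ * nu_hz**2)

def tb_over_t(nu_hz, temp_k):
    """Ratio T_B / T for a blackbody; close to 1 only when h nu << k T."""
    intensity = planck_intensity(nu_hz, temp_k)
    return brightness_temperature(intensity, nu_hz) / temp_k

if __name__ == "__main__":
    # Illustrative cases: a warm source at 1.4 GHz and cold gas at CO line frequencies.
    for nu_ghz, temp in [(1.4, 1.0e4), (115.271, 10.0), (345.796, 10.0)]:
        ratio = tb_over_t(nu_ghz * 1e9, temp)
        print(f"{nu_ghz:8.3f} GHz, T = {temp:8.1f} K: T_B/T = {ratio:.4f}")
```

The ratio $T_B/T$ stays very close to unity at centimeter wavelengths for warm sources but drops well below unity for cold gas at millimeter and submillimeter frequencies, which is the regime in which the condition $\nu$ (GHz) $\ll 20\,T$ (K) fails.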


1.2.2 Source Positions and Nomenclature 


The positions of radio sources are measured in the celestial coordinates right 
ascension and declination. On the celestial sphere, these quantities are analogous, 
respectively, to longitude and latitude on the Earth but tied to the plane of the 
Earth’s orbit around the Sun. The zero of right ascension is arbitrarily chosen as the 
point at which the Sun crosses the celestial equator (going from negative to positive 
declination) on the vernal equinox at the first point of Aries at a given epoch.
Positions of objects in celestial coordinates vary as a result of precession and nutation
of the Earth’s axis of rotation, aberration, and proper motion. These positions are 
usually listed for the standard epoch of the year 2000. Former standard epochs were 
1950 and 1900. Methods of naming sources have proceeded haphazardly over the 
centuries. Important optical catalogs of sources were constructed as numerical lists, 
often in order of right ascension. Examples include the Messier catalog of nonstellar 
objects (Messier 1781; now containing 110 objects identified as galaxies, nebulae, 
and star clusters), the New General Catalog of nonstellar sources (Dreyer 1888; 
originally with 7,840 objects, mostly galaxies), and the Henry Draper catalog of 
stars (Cannon and Pickering 1924; now with 359,083 entries). The earliest radio 
sources were designated by their associated constellation. Hence, Cygnus A is the 
strongest source in the constellation of Cygnus. As the radio sky was systematically 
surveyed, catalogs appeared such as the third Cambridge catalog (3C), with 471 
entries in the original list [Edge et al. (1959), extragalactic sources, e.g., 3C273] and 
the Westerhout catalog of 81 sources along the galactic plane [Westerhout (1958); 
mostly ionized nebulae, e.g., W3]. 

In 1974, the International Astronomical Union adopted a resolution (International
Astronomical Union 1974) to standardize the naming of sources based on
their coordinates in the epoch of 1950 called the 4 + 4 system, in which the first
four characters give the hour and minutes of right ascension (RA); the fifth, the
sign of the declination (Dec.); and the remaining three, the degrees and tenths
of degrees of declination. For example, the source at RA $01^{\rm h}34^{\rm m}49.83^{\rm s}$, Dec.
$32^\circ 54' 20.5''$ would be designated 0134+329. Note that coordinates were truncated,
not rounded. This system no longer has the accuracy needed to distinguish among
sources. The IAU Task Group on Astronomical Designations [International
Astronomical Union (2008); see also NASA/IPAC Extragalactic Database (2013)]
currently recommends the following convention. The source
name begins with an identification acronym followed by a letter to identify the
type of coordinates, followed by the coordinates to requisite accuracy. Examples
of identification acronyms are QSO (quasi-stellar object), PSR (pulsar), and PKS
(Parkes Radio Source). Coordinate identifiers are usually limited to J for epoch
2000, B for epoch 1950, and G for galactic coordinates. Hence, the radio source at
the center of the galaxy M87, also known as NGC4486, contains an active galactic
nucleus (AGN) centered at RA = $12^{\rm h}30^{\rm m}49.42338^{\rm s}$, Dec. = $12^\circ 23' 28.0439''$,
which might be designated AGN J1230494233+122328043. It is also well known
by the designations Virgo A and 3C274. Many catalogs of radio sources have
been made, and some of them are described in Sect. 1.3.8. An index of more than
50 catalogs made before 1970, identifying more than 30,000 extragalactic radio
sources, was compiled by Kesteven and Bridle (1971).
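
The 4 + 4 rule described above can be sketched in a few lines of Python. The function name and argument layout are hypothetical, chosen only for this illustration; the logic follows the description in the text (hours and minutes of RA, sign, then degrees and tenths of a degree of declination, truncated rather than rounded).

```python
import math

def designation_4plus4(ra_h, ra_m, dec_sign, dec_deg, dec_arcmin, dec_arcsec=0.0):
    """Return a 4 + 4 style name: HHMM, sign, degrees and tenths of a degree.

    ra_h, ra_m : integer hours and minutes of right ascension (seconds are dropped)
    dec_sign   : "+" or "-"
    dec_deg, dec_arcmin, dec_arcsec : magnitude of the declination
    """
    dec_abs = dec_deg + dec_arcmin / 60.0 + dec_arcsec / 3600.0
    # Truncate to tenths of a degree; the small offset guards against
    # floating-point values such as 328.99999999 for an exact 32.9 degrees.
    tenths = math.floor(dec_abs * 10.0 + 1e-9)
    return f"{ra_h:02d}{ra_m:02d}{dec_sign}{tenths:03d}"

if __name__ == "__main__":
    # The example from the text: RA 01h34m49.83s, Dec. +32d54'20.5"
    print(designation_4plus4(1, 34, "+", 32, 54, 20.5))
```

For the example in the text, the declination is about 32.906°, which truncates to 32.9° and gives the three characters 329 after the sign.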

An example of a more recent survey is the NRAO VLA Sky Survey (NVSS)
conducted by Condon et al. (1998) using the Very Large Array (VLA) at 1.4 GHz,
which contains approximately $2 \times 10^6$ sources (about one source per 100 beam
solid angles). Another important catalog derived from VLBI observations is the
International Celestial Reference Frame (ICRF), which contains 295 sources with
positions accurate to about 40 microarcseconds (Ma et al. 1998; Fey et al. 2015).


1.2.3 Reception of Cosmic Signals 


The antennas used most commonly in radio astronomy are of the reflector type 
mounted to allow tracking over most of the sky. The exceptions are mainly 
instruments designed for meter or longer wavelengths. The collecting area A of a 
reflector antenna, for radiation incident in the center of the main beam, is equal to 
the geometrical area multiplied by an aperture efficiency factor, which is typically 
within the range 0.3-0.8. The received power $P_A$ delivered by the antenna to a
matched load in a bandwidth $\Delta\nu$, from a randomly polarized source of flux density
S, assumed to be small compared to the beamwidth, is given by

$$P_A = \tfrac{1}{2} S A \Delta\nu . \tag{1.3}$$

Note that S is the intensity $I_\nu$ integrated over the solid angle of the source. The factor
$\tfrac{1}{2}$ takes account of the fact that the antenna responds to only one-half the power in
the randomly polarized wave. It is often convenient to express random noise power,
P, in terms of an effective temperature T, as

$$P = kT\Delta\nu , \tag{1.4}$$

where k is Boltzmann’s constant. In the Rayleigh-Jeans domain, P is equal to the
noise power delivered to a matched load by a resistor at physical temperature T
(Nyquist 1928). In the general case, if we use the Planck formula [Eq. (1.1)], we can
write $P = kT_{\rm Planck}\Delta\nu$, where $T_{\rm Planck}$ is an effective radiation temperature, or noise
temperature, of a load at physical temperature T, and is given by

$$T_{\rm Planck} = T \left[ \frac{h\nu/kT}{e^{h\nu/kT} - 1} \right] . \tag{1.5}$$


The noise power in a receiving system (see Appendix 1.1) can be specified in
terms of the system temperature $T_S$ associated with a matched resistive load that
would produce an equal power level in an equivalent noise-free receiver when
connected to the input terminals. $T_S$ is defined as the power available from this
load divided by $k\Delta\nu$. In terms of the Planck formula, the relation between $T_S$ and
the physical temperature, T, of such a load is given by replacing $T_{\rm Planck}$ by $T_S$ in
Eq. (1.5).

The system temperature consists of two parts: $T_R$, the receiver temperature,
which represents the internal noise from the receiver components, plus the unwanted
noise incurred from connecting the receiver to the antenna and from the noise
components from the antenna produced by ground radiation, atmospheric emission,
ohmic losses, and other sources.

We reserve the term antenna temperature to refer to the component of the power
received by the antenna that results from a cosmic source under study. The power
received in an antenna from the source is [see Eq. (1.4)]

$$P_A = kT_A\Delta\nu , \tag{1.6}$$

and $T_A$ is related to the flux density by Eqs. (1.3) and (1.6). It is useful to express
this relation as $T_A$ (K) $= SA/2k = S$ (Jy) $\times A$ (m$^2$)/2800. Astronomers sometimes
specify the performance of an antenna in terms of janskys per kelvin, that is, the flux
density (in units of $10^{-26}$ W m$^{-2}$ Hz$^{-1}$) of a point source that increases $T_A$ by one
kelvin. Thus, this measure is equal to 2800/A (m$^2$) Jy K$^{-1}$.

Another term that may be encountered is the system equivalent flux density,
SEFD, which is an indicator of the combined sensitivity of both an antenna and
receiving system. It is equal to the flux density of a point source in the main beam
of the antenna that would cause the noise power in the receiver to be twice that of
the system noise in the absence of a source. Equating $P_A$ in Eq. (1.3) with $kT_S\Delta\nu$,
we obtain

$$\mathrm{SEFD} = \frac{2kT_S}{A} . \tag{1.7}$$
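
The relations among flux density, antenna temperature, and SEFD in Eqs. (1.3), (1.6), and (1.7) reduce to a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and the collecting area of $10^4$ m$^2$ and system temperature of 50 K are sample values.

```python
K_BOLTZ = 1.380649e-23   # Boltzmann constant, J/K
JANSKY = 1.0e-26         # 1 Jy in W m^-2 Hz^-1

def antenna_temperature(flux_jy, area_m2):
    """T_A = S A / 2k, from Eqs. (1.3) and (1.6); result in kelvin."""
    return flux_jy * JANSKY * area_m2 / (2.0 * K_BOLTZ)

def jy_per_kelvin(area_m2):
    """Flux density (Jy) of a point source that raises T_A by 1 K."""
    return 2.0 * K_BOLTZ / (area_m2 * JANSKY)

def sefd_jy(t_sys_k, area_m2):
    """System equivalent flux density from Eq. (1.7), in Jy."""
    return 2.0 * K_BOLTZ * t_sys_k / (area_m2 * JANSKY)

if __name__ == "__main__":
    area = 1.0e4    # collecting area in m^2 (sample value)
    t_sys = 50.0    # system temperature in K (sample value)
    print(f"Jy per K for 1 m^2 : {jy_per_kelvin(1.0):.1f}")
    print(f"Jy per K           : {jy_per_kelvin(area):.4f}")
    print(f"T_A for 1 Jy       : {antenna_temperature(1.0, area):.3f} K")
    print(f"SEFD               : {sefd_jy(t_sys, area):.2f} Jy")
```

The first printed value is the constant $2k/10^{-26}$, which the text rounds to 2800.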


The ratio of the signal power from a source to the noise power in the receiving
amplifier is $T_A/T_S$. Because of the random nature of the signal and noise,
measurements of the power levels made at time intervals separated by $(2\Delta\nu)^{-1}$ can be
considered independent. A measurement in which the signal level is averaged for
a time $\tau$ contains approximately $2\Delta\nu\,\tau$ independent samples. The signal-to-noise
ratio (SNR), $R_{\rm sn}$, at the output of a power-measuring device attached to the receiver
is increased in proportion to the square root of the number of independent samples
and is of the form

$$R_{\rm sn} = C \frac{T_A}{T_S} \sqrt{\Delta\nu\,\tau} , \tag{1.8}$$


where C is a constant that is greater than or equal to one. This result (derived in
Appendix 1.1) appears to have been first obtained by Dicke (1946) for an analog
system. C = 1 for a simple power-law receiver with a rectangular passband and can
be larger by a factor of ~ 2 for more complicated systems. Typical values of $\Delta\nu$
and $\tau$ are of order 1 GHz and 6 h, which result in a value of $4 \times 10^6$ for the factor
$(\Delta\nu\,\tau)^{1/2}$. As a result, it is possible to detect a signal for which the power level is
less than $10^{-6}$ times the system noise. A particularly effective use of long averaging
time is found in the observations with the Cosmic Background Explorer (COBE)
satellite, in which it was possible to measure structure at a brightness temperature
level less than $10^{-7}$ of the system temperature (Smoot et al. 1990, 1992).

The following calculation may help to illustrate the low energies involved in radio
astronomy. Consider a large radio telescope with a total collecting area of $10^4$ m$^2$
pointed toward a radio source of flux density 1 mJy (= $10^{-3}$ Jy) and accepting
signals over a bandwidth of 50 MHz. In $10^3$ years, the total energy accepted is
about $10^{-7}$ J (1 erg), which is comparable to a few percent of the kinetic energy in a
single falling snowflake. To detect the source with the same telescope and a system
temperature of 50 K would require an observing time of about 5 min, during which
time the energy received would be about $10^{-15}$ J.
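
The numbers in the preceding two paragraphs can be checked with a short Python sketch based on Eqs. (1.3) and (1.8). The function names are hypothetical, C = 1 is assumed, and a year is taken as 365.25 days; the input values are those quoted in the text.

```python
import math

K_BOLTZ = 1.380649e-23          # Boltzmann constant, J/K
JANSKY = 1.0e-26                # 1 Jy in W m^-2 Hz^-1
SECONDS_PER_YEAR = 365.25 * 86400.0

def received_power(flux_jy, area_m2, bandwidth_hz):
    """Eq. (1.3): P_A = S A delta_nu / 2, in watts."""
    return 0.5 * flux_jy * JANSKY * area_m2 * bandwidth_hz

def signal_to_noise(t_ant_k, t_sys_k, bandwidth_hz, tau_s, c_factor=1.0):
    """Eq. (1.8): R_sn = C (T_A / T_S) sqrt(delta_nu tau)."""
    return c_factor * (t_ant_k / t_sys_k) * math.sqrt(bandwidth_hz * tau_s)

if __name__ == "__main__":
    flux, area, bandwidth = 1.0e-3, 1.0e4, 50.0e6   # 1 mJy, 10^4 m^2, 50 MHz
    power = received_power(flux, area, bandwidth)
    print(f"received power          : {power:.2e} W")
    print(f"energy in 1000 years    : {power * 1000.0 * SECONDS_PER_YEAR:.1e} J")
    print(f"energy in 5 minutes     : {power * 300.0:.1e} J")

    t_ant = flux * JANSKY * area / (2.0 * K_BOLTZ)  # antenna temperature, K
    print(f"SNR after 5 min, 50 K   : {signal_to_noise(t_ant, 50.0, bandwidth, 300.0):.1f}")
    print(f"sqrt(1 GHz x 6 h)       : {math.sqrt(1.0e9 * 6.0 * 3600.0):.2e}")
```

With these inputs the received power is 2.5 × 10$^{-18}$ W, and a 5-min integration gives an SNR of roughly 9 for C = 1, consistent with the statement that about 5 min suffices for a detection.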


1.3 Development of Radio Interferometry 


1.3.1 Evolution of Synthesis Techniques 


This section presents a brief history of interferometry in radio astronomy. As an 
introduction, the following list indicates some of the more important steps in the 
progress from the Michelson stellar interferometer to the development of
multielement, synthesis imaging arrays and VLBI:


1. Michelson stellar interferometer. This optical instrument introduced the
technique of using two spaced receiving apertures, and the measurement of fringe
amplitude to determine angular width (1890-1921).

2. First astronomical observations with a two-element radio interferometer. Ryle 
and Vonberg (1946), solar observations. 

3. Phase-switching interferometer. First implementation of the voltage-multiplying 
action of a correlator, which is the device used to combine the signals from two 
antennas (1952). 


4. Astronomical calibration. Gradual accumulation during the 1950s and 1960s of
accurate positions for small-diameter radio sources from optical identifications
and other means. Observations of such sources enabled accurate calibration of
interferometer baselines and instrumental phases.

5. Early measurements of angular dimensions of sources. Use of variable-baseline
interferometers (~ 1952 onward).

6. Solar arrays. Development of multiantenna arrays of centimeter-wavelength
tracking antennas that provided detailed maps and profiles of the solar disk
(mid-1950s onward).

7. Arrays of tracking antennas. General movement from meter-wavelength,
nontracking antennas to centimeter-wavelength, tracking antennas. Development of
multielement arrays with a separate correlator for each baseline (~ 1960s).

8. Earth-rotation synthesis. Introduced by Ryle with some precedents from solar
imaging. The development of computers to control receiving systems and
perform Fourier transforms required in imaging was an essential component
(1962).

9. Spectral line capability. Introduced into radio interferometry (~ 1962).

10. Development of image-processing techniques. Based on phase and amplitude
closure, nonlinear deconvolution and other techniques, as described in Chaps. 10
and 11 (~ 1974 onward).

11. Very-long-baseline interferometry (VLBI). First observations (1967).
Superluminal motion in active galactic nuclei discovered (1971). Contemporary
plate motion detected (1986). International Celestial Reference Frame adopted
(1998).

12. Millimeter-wavelength instruments (~ 100-300 GHz). Major developments
mid-1980s onward.

13. Orbiting VLBI (OVLBI). U.S. Tracking and Data Relay Satellite System
(TDRSS) experiment (1986-88). VLBI Space Observatory Programme (VSOP)
(1997). RadioAstron (2011).

14. Submillimeter-wavelength instruments (300 GHz-1 THz). James Clerk
Maxwell Telescope-Caltech Submillimeter Observatory interferometer (1992).
Submillimeter Array of the Smithsonian Astrophysical Observatory (SAO) and
Academia Sinica of Taiwan (2004). Atacama Large Millimeter/submillimeter
Array (ALMA) (2013).


1.3.2 Michelson Interferometer 


Interferometric techniques in astronomy date back to the optical work of Michelson
(1890, 1920) and of Michelson and Pease (1921), who were able to obtain
sufficiently fine angular resolution to measure the diameters of some of the nearer
and larger stars such as Arcturus and Betelgeuse. The basic similarity of the theory
of radio and optical radiation fields was recognized early by radio astronomers,
and optical experience has provided valuable precedents to the theory of radio
interferometry.

Fig. 1.4 (a) Schematic diagram of the Michelson-Pease stellar interferometer. The incoming rays
are guided into the telescope aperture by mirrors $m_1$ to $m_4$, of which the outer pair define the two
apertures of the interferometer. Rays $a_1$ and $b_1$ traverse equal paths to the eyepiece at which the
image is formed, but rays $a_2$ and $b_2$, which approach at an angle $\theta$ to the instrumental axis, traverse
paths that differ by a distance $\Delta$. (b) The intensity of the image as a function of position angle in
a direction parallel to the spacing of the interferometer apertures. The solid line shows the fringe
profiles for an unresolved star ($V_M = 1.0$), and the broken line is for a partially resolved star for
which $V_M = 0.5$.

As shown in Fig. 1.4, beams of light from a star fall upon two apertures and are
combined in a telescope. The resulting stellar image has a finite width and is shaped
by effects that include atmospheric turbulence, diffraction at the mirrors, and the
bandwidth of the radiation. Maxima in the light intensity resulting from interference
occur at angles $\theta$ for which the difference $\Delta$ in the path lengths from the star to the
point at which the light waves are combined is an integral number of wavelengths
at the effective center of the optical passband. If the angular width of the star is
small compared with the spacing in $\theta$ between adjacent maxima, the image of the
star is crossed by alternate dark and light bands, known as interference fringes. If,
however, the width of the star is comparable to the spacing between maxima, one
can visualize the resulting image as being formed by the superposition of images
from a series of points across the star. The maxima and minima of the fringes from
different points do not coincide, and the fringe amplitude is attenuated, as shown in
Fig. 1.4b. As a measure of the relative amplitude of the fringes, Michelson defined
the fringe visibility, $V_M$, as

$$V_M = \frac{\text{intensity of maxima} - \text{intensity of minima}}{\text{intensity of maxima} + \text{intensity of minima}} . \tag{1.9}$$


Note that with this definition, the visibility is normalized to unity when the intensity 
at the minima is zero, that is, when the width of the star is small compared with 
the fringe width. If the fringe visibility is measurably less than unity, the star 
is said to be resolved by the interferometer. In their 1921 paper, Michelson and 
Pease explained the apparent paradox that their instrument could be used to detect 
structure smaller than the seeing limit imposed by atmospheric turbulence. The 
fringe pattern, as depicted in Fig. 1.4, moves erratically on time scales of 10-100 ms. 
Over long averaging time, the fringes are smoothed out. However, the “jittering” 
fringes can be discerned by the human eye, which has a typical response time of 
tens of milliseconds. 

Let $I(l, m)$ be the two-dimensional intensity of the star, or of a source in the case
of a radio interferometer. $(l, m)$ are coordinates on the sky, with $l$ measured parallel
to the aperture spacing vector and $m$ normal to it. The fringes provide resolution
in a direction parallel to the aperture spacing only. In the orthogonal direction, the
response is simply proportional to the intensity integrated over solid angle. Thus,
the interferometer measures the intensity projected onto the $l$ direction, that is, the
one-dimensional profile $I_1(l)$ given by

$$I_1(l) = \int I(l, m)\, dm . \tag{1.10}$$


As will be shown in later chapters, the fringe visibility is proportional to the
modulus of the Fourier transform of $I_1(l)$ with respect to the spacing of the apertures
measured in wavelengths. Figure 1.5 shows the integrated profile $I_1$ for three simple
models of a star or radio source and the corresponding fringe visibility as a function
of u, the spacing of the interferometer apertures in units of the wavelength. At the
top of the figure is a rectangular pillbox distribution, in the center a circular pillbox,
and at the bottom a circular Gaussian function. The rectangular pillbox represents
a uniformly bright rectangle on the sky with sides parallel to the $l$ and $m$ axes and
width a in the $l$ direction. The circular pillbox represents a uniformly bright circular
disk of diameter a. When projected onto the $l$ axis, the one-dimensional intensity
function $I_1$ has a semicircular profile. The Gaussian model is a circularly symmetric
source with Gaussian taper of the intensity from the maximum at the center. The
intensity is proportional to $\exp[-4 \ln 2\,(l^2 + m^2)/a^2]$, resulting in circular contours
and a diameter a at the half-intensity level. Any slice through the model in a plane
perpendicular to the $(l, m)$ plane has a Gaussian profile with the same half-height
width, a.

Fig. 1.5 The one-dimensional intensity profiles $I_1(l)$ for three simple intensity models: (a) left,
a uniform rectangular source; (b) left, a uniform circular source; and (c) left, a circular Gaussian
distribution. The corresponding Michelson visibility functions $V_M$ are on the right. $l$ is an angular
variable on the sky, u is the spacing of the receiving apertures measured in wavelengths, and a is the
characteristic angular width of the model. The solid lines in the curves of $V_M$ indicate the modulus
of the Fourier transform of $I_1(l)$, and the broken lines indicate negative values of the transform.
See text for further explanation. Models are discussed in more detail in Sect. 10.4.
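
To make the relation between the projected profile $I_1(l)$ and the fringe visibility concrete, the following Python sketch evaluates the normalized modulus of the Fourier transform of $I_1(l)$ by direct numerical integration for the three models described above (rectangle, uniform disk, and Gaussian). It is a minimal illustration under stated assumptions: the function names are hypothetical, the width a is set to 1 so that u is expressed in units of 1/a, and a simple midpoint rule stands in for the analytic transforms.

```python
import math

LN2 = math.log(2.0)

def rect_profile(l, a=1.0):
    """Projected profile of a uniform rectangle of width a."""
    return 1.0 if abs(l) <= a / 2.0 else 0.0

def disk_profile(l, a=1.0):
    """Projected (semicircular) profile of a uniform disk of diameter a."""
    return math.sqrt(max(0.0, (a / 2.0) ** 2 - l * l))

def gauss_profile(l, a=1.0):
    """Projected profile of a circular Gaussian with half-intensity width a."""
    return math.exp(-4.0 * LN2 * l * l / (a * a))

def visibility(profile, half_width, u, n=4000):
    """Modulus of the Fourier transform of profile(l) at spacing u,
    normalized to unity at u = 0 (midpoint-rule integration over
    -half_width <= l <= half_width)."""
    dl = 2.0 * half_width / n
    re = im = total = 0.0
    for i in range(n):
        l = -half_width + (i + 0.5) * dl
        w = profile(l)
        re += w * math.cos(2.0 * math.pi * u * l)
        im += w * math.sin(2.0 * math.pi * u * l)
        total += w
    return math.hypot(re, im) / total

if __name__ == "__main__":
    for u in (0.0, 0.5, 1.0, 1.22, 1.5):
        print(f"u = {u:4.2f}/a  rect {visibility(rect_profile, 0.5, u):.4f}"
              f"  disk {visibility(disk_profile, 0.5, u):.4f}"
              f"  gauss {visibility(gauss_profile, 4.0, u):.4f}")
```

In this sketch the rectangular model reaches its first null at u = 1/a and the uniform disk near u = 1.22/a, while the Gaussian model decreases monotonically without nulls.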

Michelson and Pease used mainly the circular disk model to interpret their 
observations and determined the stellar diameter by varying the aperture spacing 
of the interferometer to locate the first minimum in the visibility function. In the 
age before electronic instrumentation, the adjustment of such an instrument and 
the visual estimation of $V_M$ required great care, since, as described above, the
fringes were not stable but vibrated across the image in a random manner as a result 
of atmospheric fluctuations. The published results on stellar diameters measured 
with this method were never extended beyond the seven bright stars in Pease’s 
(1931) list; for a detailed review see Hanbury Brown (1968). However, the use 
of electro-optical techniques now offers much greater instrumental capabilities in 
optical interferometry, as discussed in Sect. 17.4. 


1.3.3 Early Two-Element Radio Interferometers 


In 1946, Ryle and Vonberg constructed a radio interferometer to investigate cosmic 
radio emission, which had been discovered and verified by earlier investigators 
(Jansky 1933; Reber 1940; Appleton 1945; Southworth 1945). This interferometer 
used dipole antenna arrays at 175 MHz, with a baseline (i.e., the spacing between 
the antennas) that was variable between 10 and 140 wavelengths (17 and 240 m). A 
diagram of such an instrument and the type of record obtained are shown in Fig. 1.6. 
In this and most other meter-wavelength interferometers of the 1950s and 1960s, the 
antenna beams were pointed in the meridian, and the rotation of the Earth provided 
scanning in right ascension. 

The receiver in Fig. 1.6 is sensitive to a narrow band of frequencies, and a
simplified analysis of the response of the interferometer can be obtained in terms
of monochromatic signals at the center frequency $\nu_0$. We consider the signal from
a radio source of very small angular diameter that is sufficiently distant that the
incoming wavefront effectively lies in a plane. Let the signal voltage from the right
antenna in Fig. 1.6 be represented by $V \sin(2\pi\nu_0 t)$. The longer path length to the
left antenna (as in Fig. 1.4) introduces a time delay $\tau = (D/c) \sin\theta$, where D is
the antenna spacing, $\theta$ is the angular position of the source, and c is the velocity of
light. Thus, the signal from the left antenna is $V \sin[2\pi\nu_0(t - \tau)]$. The detector of
the receiver generates a response proportional to the squared sum of the two signal
voltages:

$$\{ V \sin(2\pi\nu_0 t) + V \sin[2\pi\nu_0(t - \tau)] \}^2 . \tag{1.11}$$


The output of the detector is averaged in time, i.e., it contains a lowpass filter 
that removes any frequencies greater than a few hertz or tens of hertz, so in 




[Figure 1.6: (a) block diagram showing two dipole antennas, a receiver with square-law (power-linear) detector, and a chart recorder; (b) chart record of output voltage versus time (hours), 16.00 to 20.00.]


Fig. 1.6 (a) A simple interferometer, also called an adding interferometer, in which the signals are 
combined additively. (b) Record from such an interferometer with east-west antenna spacing. The 
ordinate is the total power received, since the voltage from the square-law detector is proportional 
to power, and the abscissa is time. The source at the left is Cygnus A and the one at the 
right Cassiopeia A. The increase in level near Cygnus A results from the galactic background 
radiation, which is concentrated toward the plane of our Galaxy but is completely resolved by the 
interferometer fringes. The record is from Ryle (1952). Reproduced with permission of the Royal 
Society, London, and the Master and Fellows of Churchill College, Cambridge. © Royal Society. 


expanding (1.11), we can ignore the term in the harmonic of $2\pi\nu_0 t$. The detector
output,² in terms of the power $P_0$ generated by either of the antennas alone, is
therefore

$$P = P_0 [1 + \cos(2\pi\nu_0\tau)] . \tag{1.12}$$

²For simplicity, in expression (1.11), we added the signal voltages from each antenna. In practice,
such signals must be combined in networks that obey the conservation of power. Thus, if the
signal from each antenna is represented as a voltage source V and characteristic impedance R, the
power available is $V^2/R$. Combining two signals in series can be represented by a voltage 2V and
impedance 2R, giving a power of $2V^2/R$. In contrast, in free space, the addition of two coherent
electric fields of equal strength quadruples the power. This distinction is important in the discussion
of the sea interferometer (Sect. 1.3.4).

Because $\tau$ varies only slowly as the Earth rotates, the frequency represented by
$\cos(2\pi\nu_0\tau)$ is not filtered out. In terms of the source position, $\theta$, we have

$$P = P_0 \left[ 1 + \cos\left( \frac{2\pi\nu_0 D \sin\theta}{c} \right) \right] . \tag{1.13}$$

Thus, as the source moves across the sky, P varies between 0 and $2P_0$, as shown
by the sources in Fig. 1.6b. The response is modulated by the beam pattern of the 
antennas, of which the maximum is pointed in the meridian. The cosine function in 
Eq. (1.13) represents the Fourier component of the source brightness to which the 
interferometer responds. The angular width of the fringes is less than the angular 
width of the antenna beam by (approximately) the ratio of the width of an antenna 
to the baseline D, which in this example is about 1/10. The use of an interferometer 
instead of a single antenna results in a corresponding increase in precision in 
determining the time of transit of the source. The form of the fringe pattern in 
Eq. (1.13) also applies to the Michelson interferometer in Fig. 1.4. In the former case 
(radio), the fringes develop as a function of time, while in the latter case (optical), 
they appear as a function of position in the pupil plane of the telescope. 
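
The step from expression (1.11) to Eqs. (1.12) and (1.13) can be checked numerically. The Python sketch below averages the squared sum of the two voltages over one period of the center frequency, which stands in for the lowpass filter, and then evaluates the fringe pattern of Eq. (1.13) with the baseline expressed in wavelengths. The function names are hypothetical. The averaged output equals $V^2[1 + \cos(2\pi\nu_0\tau)]$; its overall scale relative to $P_0$ depends on how the combining network is normalized (see footnote 2), so only the form of the fringe term is being checked.

```python
import math

def averaged_detector_output(nu0_hz, tau_s, v=1.0, n=10000):
    """Time average of expression (1.11) over one period of nu0 (midpoint rule)."""
    period = 1.0 / nu0_hz
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * period / n
        s = (v * math.sin(2.0 * math.pi * nu0_hz * t)
             + v * math.sin(2.0 * math.pi * nu0_hz * (t - tau_s)))
        total += s * s
    return total / n

def fringe_power(theta_rad, baseline_wavelengths, p0=1.0):
    """Eq. (1.13) with D nu0 / c written as the baseline in wavelengths."""
    return p0 * (1.0 + math.cos(2.0 * math.pi * baseline_wavelengths * math.sin(theta_rad)))

if __name__ == "__main__":
    nu0 = 175.0e6    # observing frequency of the Ryle and Vonberg instrument, Hz
    tau = 1.0e-9     # sample delay in seconds (illustrative)
    print(averaged_detector_output(nu0, tau), 1.0 + math.cos(2.0 * math.pi * nu0 * tau))

    # Angular fringe spacing near the meridian is about 1 / (D / lambda) radians.
    for d_lambda in (10.0, 140.0):
        print(f"D = {d_lambda:5.0f} wavelengths: fringe spacing ~ {math.degrees(1.0 / d_lambda):.2f} deg")
```

For the baselines of 10 and 140 wavelengths quoted above, the fringe spacing near the meridian is roughly 5.7° and 0.4°, respectively.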


1.3.4 Sea Interferometer 


A different implementation of interferometry, known as the sea interferometer, 
or Lloyd’s mirror interferometer (Bolton and Slee 1953), was provided by a 
number of horizon-pointing antennas near Sydney, Australia. These had been 
installed for radar during World War II at several coastal locations, at elevations of 
60-120 m above the sea. Radiation from sources rising over the eastern horizon 
was received both directly and by reflection from the sea, as shown in Fig. 1.7. The 
frequencies of the observations were in the range 40-400 MHz, the middle part of 
the range being the most satisfactory because of the sensitivity of receivers there 
and because of ionospheric effects at lower frequencies and sea roughness at higher 
frequencies. The sudden appearance of a rising source was useful in separating 
individual sources. Because of the reflected wave, the power received at the peak of 
a fringe was four times that for direct reception with the single antenna, and twice 
that of an adding interferometer with two of the same antennas (see footnote 2). 
Observations of the Sun by McCready et al. (1947) using this system provided the 
first published record of interference fringes in radio astronomy. They recognized 
that they were measuring a Fourier component of the brightness distribution and 
used the term “Fourier synthesis” to describe how an image could be produced 
from fringe visibility measurements on many baselines. Observations of the source 
Cygnus A by Bolton and Stanley (1948) provided the first positive evidence of the 
existence of a discrete nonsolar radio source. Thus, the sea interferometer played 
an important part in early radio astronomy, but the effects of the long atmospheric 
paths, the roughness of the sea surface, and the difficulty of varying the physical 
length of the baseline, which was set by the cliff height, precluded further useful 
development.

Fig. 1.7 (a) Schematic diagram of a sea interferometer. The fringe pattern is similar to that which
would be obtained with the actual receiving antenna and one at the position of its image in the
sea. The reflected ray undergoes a phase change of 180° on reflection and travels an extra distance
$\Delta$ in reaching the receiving antenna. (b) Sea interferometer record of the source Cygnus A at
100 MHz by Bolton and Stanley (1948). The source rose above the horizon at approximately
$22^{\rm h}17^{\rm m}$. The broken line was inserted to show that the record could be interpreted in terms of a
steady component and a fluctuating component of the source; the fluctuations were later shown to
be of ionospheric origin. The fringe width was approximately 1.0° and the source is unresolved,
that is, its angular width is small in comparison with the fringe width. Part (b) is reprinted by
permission from MacMillan Publishers Ltd.: Nature, 161, 312-313, © 1948.


1.3.5 Phase-Switching Interferometer 


A problem with the interferometer systems in both Figs. 1.6 and 1.7 is that
in addition to the signal from the source, the output of the receiver contains 
components from other sources of noise power such as the galactic background 
radiation, thermal noise from the ground picked up in the antenna sidelobes, and 




[Figure 1.8: block diagram showing a receiver with square-law detector, a switch-frequency generator, and a synchronous detector whose output goes to the recorder.]


Fig. 1.8 Phase-switching interferometer. The signal from one antenna is periodically reversed in 
phase, indicated here by switching an additional half-wavelength of path into the transmission line. 


the noise generated in the amplifiers of the receiver. For all except the few strongest 
cosmic sources, the component from the source is several orders of magnitude less 
than the total noise power in the receiver. Thus, a large offset has been removed 
from the records shown in Figs. 1.6b and 1.7b. This offset is proportional to the 
receiver gain, changes in which are difficult to eliminate entirely. The resulting drifts 
in the output level degrade the detectability of weak sources and the accuracy of 
measurement of the fringes. With the technology of the 1950s, the receiver output 
was usually recorded on a paper chart and could be lost when baseline drifts caused 
the recorder pen to go off scale. 

The introduction of phase switching by Ryle (1952), which removed the 
unwanted components of the receiver output, leaving only the fringe oscillations, 
was the most important technical improvement in early radio interferometry. If $V_1$ 
and $V_2$ represent the signal voltages from the two antennas, the output from the 
simple adding interferometer is proportional to $(V_1 + V_2)^2$. In the phase-switching 
system, shown in Fig. 1.8, the phase of one of the signals is periodically reversed, 
so the output of the detector alternates between $(V_1 + V_2)^2$ and $(V_1 - V_2)^2$. The 
frequency of the switching is a few tens of hertz, and a synchronous detector 
takes the difference between the two output terms, which is proportional to $V_1 V_2$. 
Thus, the output of a phase-switching interferometer is the time average of the 
product of the signal voltages; that is, it is proportional to the cross-correlation 
of the two signals. The circuitry that performs the multiplication and averaging of 
the signals in a modern interferometer is known as a correlator: a more general 
definition of a correlator will be given later. Comparison with the output of the 
system in Fig. 1.6 shows that if the signals from the antennas are multiplied instead 
of added and squared, then the constant term within the square brackets in Eq. (1.13) 
Fig. 1.9 Output of a phase-switching interferometer as a function of time, showing the response 
to a number of sources. From Ryle (1952). Reproduced with permission of the Royal Society, 
London, and the Master and Fellows of Churchill College, Cambridge. © Royal Society. 


disappears, and only the cosine term remains. The output consists of the fringe 
oscillations only, as shown in Fig. 1.9. The removal of the constant term greatly 
reduces the sensitivity to instrumental gain variation, and it becomes practicable 
to install amplifiers at the antennas to overcome attenuation in the transmission 
lines. This advance resulted in the use of longer antenna spacings and larger arrays. 
Most interferometers from about 1950 onward incorporated phase switching, which 
provided the earliest means of implementing the multiplying action of a correlator. 
With more modern instruments, it is no longer necessary to use phase switching 
to obtain the voltage-multiplying action, but it is often included to help eliminate 
various instrumental imperfections, as described in Sect. 7.5. 
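
The cancellation of the constant term can be checked numerically. The following Python sketch is an illustration added here and is not taken from Ryle (1952). It assumes, for simplicity, a common noise-like source signal at both antennas, independent Gaussian receiver noise in each channel, and no geometric delay (so no fringes are produced). The function names and parameter values are hypothetical.

```python
import random


def adding_output(v1, v2):
    """Square-law detector output of a simple adding interferometer."""
    return (v1 + v2) ** 2


def switched_difference(v1, v2):
    """Difference of the detector outputs for the two switch states.

    Algebraically this equals 4 * v1 * v2, so the self-power terms cancel.
    """
    return (v1 + v2) ** 2 - (v1 - v2) ** 2


def simulate(n_samples=200_000, signal_rms=1.0, noise_rms=3.0, seed=1):
    """Return the time averages of the adding and phase-switched outputs.

    Assumed model: both antennas see the same source voltage s, and each
    receiver adds its own independent noise voltage.
    """
    rng = random.Random(seed)
    sum_adding = 0.0
    sum_switched = 0.0
    for _ in range(n_samples):
        s = rng.gauss(0.0, signal_rms)
        v1 = s + rng.gauss(0.0, noise_rms)
        v2 = s + rng.gauss(0.0, noise_rms)
        sum_adding += adding_output(v1, v2)
        sum_switched += switched_difference(v1, v2)
    return sum_adding / n_samples, sum_switched / n_samples
```

With these assumed values, the adding interferometer averages to roughly 22 (the total power, dominated by receiver noise), whereas the switched difference averages to about 4, which is four times the power of the common signal alone.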


1.3.6 Optical Identifications and Calibration Sources 


Interferometer observations by Bolton and Stanley (1948), Ryle and Smith (1948), 
Ryle et al. (1950), and others provided evidence of numerous discrete sources. 
Identification of the optical counterparts of these required accurate measurement of 
radio positions. The principal method then in use for position measurement with 
interferometers was to determine the time of transit of the central fringe using 
an east-west baseline, and also the frequency of the fringe oscillations, which is 
proportional to the cosine of the declination (see Sect. 12.1 for more details). The 
measurement of position is only as accurate as the knowledge of the interferometer 
fringe pattern, which is determined by the relative locations of the electrical centers 
of the antennas. In addition, any inequality in the electrical path lengths in the 
cables and amplifiers from the antennas to the point where the signals are combined 
introduces an instrumental phase term, which offsets the fringe pattern. Smith 
(1952a) obtained positions for four sources with rms errors as small as ±20” in right 
ascension and ±40” in declination and gave a detailed analysis of the accuracy that 
was attainable. The optical identification of Cygnus A and Cassiopeia A by Baade 
and Minkowski (1954a,b) was a direct result of improved radio positions by Smith 
(1951) and Mills (1952). Cygnus A proved to be a distant galaxy and Cassiopeia A 
a supernova remnant, but the interpretation of the optical observations was not fully 
understood at the time. 
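
The dependence of fringe frequency on declination can be illustrated with a short calculation. This sketch assumes the standard expression for an east-west baseline at meridian transit, in which the fringe frequency is the product of the Earth's rotation rate, the baseline length in wavelengths, and cos δ. That expression is not written out in this section (the text defers to Sect. 12.1), and the function names are hypothetical.

```python
import math

# Sidereal rotation rate of the Earth in radians per second.
EARTH_ROTATION_RAD_PER_S = 7.292115e-5


def fringe_frequency_hz(baseline_wavelengths, declination_deg):
    """Fringe frequency at transit for an east-west baseline (assumed model)."""
    cos_dec = math.cos(math.radians(declination_deg))
    return EARTH_ROTATION_RAD_PER_S * baseline_wavelengths * cos_dec


def declination_from_fringe_frequency(baseline_wavelengths, fringe_hz):
    """Invert the relation above; returns |declination| in degrees."""
    ratio = fringe_hz / (EARTH_ROTATION_RAD_PER_S * baseline_wavelengths)
    if ratio < 0.0 or ratio > 1.0 + 1e-12:
        raise ValueError("fringe frequency is inconsistent with the baseline")
    return math.degrees(math.acos(min(ratio, 1.0)))
```

Because cos δ is an even function, the fringe frequency alone gives only the magnitude of the declination; the sign must come from other information, such as the pointing of the antennas.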

The need for absolute calibration of the antennas and receiving system rapidly 
disappeared after a number of compact radio sources were identified with optical 
objects. Optical positions accurate to ~ 1” could then be used, and observations of 
such sources enabled calibration of interferometer baseline parameters and fringe 
phases. Although it cannot be assumed that the radio and optical positions of a 
source coincide exactly, the offsets for different sources are randomly oriented. 
Thus, errors were reduced as more calibration sources became available. Another 
important way of obtaining accurate radio positions during the 1960s and 1970s 
was by observation of occultation of sources by the Moon, which is described in 
Sect. 17.2. 


1.3.7 Early Measurements of Angular Width 


Comparison of the angular widths of radio sources with the corresponding dimen- 
sions of their optical counterparts helped in some cases to confirm identifications 
as well as to provide important data for physical understanding of the emission 
processes. In the simplest procedure, measurements of the fringe amplitude are 
interpreted in terms of intensity models such as those shown in Fig. 1.5. The peak- 
to-peak fringe amplitude for a given spacing normalized to the same quantity when 
the source is unresolved provides a measure of the fringe visibility equivalent to the 
definition in Eq. (1.9). 

Some of the earliest measurements were made by Mills (1953), who used an 
interferometer operating at 101 MHz, in which a small transportable array of Yagi 
elements could be located at distances up to 10 km from a larger antenna. The signal 
from this remote antenna was transmitted back over a radio link, and fringes were 
formed. Smith (1952b,c), at Cambridge, England, also measured the variation of 
fringe amplitude with antenna spacing but used shorter baselines than Mills and 
concentrated on precise measurements of small changes in the fringe amplitude. 
Results by both investigators provided angular sizes of a number of the strongest 
sources: Cassiopeia A, the Crab Nebula, NGC4486 (Virgo A), and NGC5128 
(Centaurus A). 

A third early group working on angular widths at the Jodrell Bank Experimental 
Station,³ England, used a different technique: intensity interferometry (Jennison 
and Das Gupta 1953, 1956; Jennison 1994). Hanbury Brown and Twiss (1954) 
had shown that if the signals received by two spaced antennas are passed through 
square-law detectors, the fluctuations in the intensity that result from the Gaussian 
fluctuations in the received field strength are correlated. The degree of correlation 


³ Later known as the Nuffield Radio Astronomy Laboratories, and since 1999 as the Jodrell Bank 
Observatory. 
varies in proportion to the square of the visibility that would be obtained in a 
conventional interferometer in which signals are combined before detection. The 
intensity interferometer has the advantage that it is not necessary to preserve the 
radio-frequency phase of the signals in bringing them to the location at which 
they are combined. This simplifies the use of long baselines, which in this case 
extended up to 10 km. A VHF radio link was used to transmit the detected signal 
from the remote antenna, for measurement of the correlation. The disadvantage of 
the intensity interferometer is that it requires a high SNR, and even for Cygnus A 
and Cassiopeia A, the two highest flux-density sources in the sky, it was necessary 
to construct large arrays of dipoles, which operated at 125 MHz. The intensity 
interferometer is discussed further in Sect. 17.1, but it has been of only limited 
use in radio astronomy because of its lack of sensitivity. 

The most important result of these intensity interferometer measurements was 
the discovery that for Cygnus A, the fringe visibility for the east-west intensity 
profile falls close to zero and then increases to a secondary maximum as the 
antenna spacing is increased. Two symmetric source models were consistent with 
the visibility values derived from the measurements. These were a two-component 
model in which the phase of the fringes changes by 180° in going through the 
minimum, and a three-component model in which the phase does not change. The 
intensity interferometer gives no information on the fringe phase, so a subsequent 
experiment was made by Jennison and Latham (1959) using conventional interfer- 
ometry. Because the instrumental phase of the equipment was not stable enough to 
permit calibration, three antennas were used and three sets of fringes for the three 
pair combinations were recorded simultaneously. If $\phi_{mn}$ is the phase of the fringe 
pattern for antennas m and n, it is easy to show that at any instant, the combination 


$$\phi_{123} = \phi_{12} + \phi_{23} + \phi_{31} \qquad (1.14)$$


is independent of instrumental and atmospheric phase effects and is a measure of 
the corresponding combination of fringe phases (Jennison 1958). By moving one 
antenna at a time, it was found that the phase does indeed change by approximately 
180° at the visibility minimum and therefore that the two-component model in 
Fig. 1.10 is the appropriate one. The use of combinations of simultaneous visibility 
measurements typified by Eq. (1.14), now referred to as closure relationships, 
became important about 20 years later in image-processing techniques. Closure 
relationships and the conditions under which they apply are discussed in Sect. 10.3. 
They are now integral parts of the self-calibration used in image formation (see 
Sect. 11.3). 
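
The cancellation expressed by Eq. (1.14) can be verified with a few lines of code. This sketch assumes that each antenna m contributes an additive phase error e_m, so that the measured phase on baseline mn is the true fringe phase plus e_m minus e_n. That error model and all names below are illustrative assumptions rather than details given in the text.

```python
import math


def wrap(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


def measured_phases(true_phases, antenna_errors):
    """Apply antenna-based phase errors to the three baseline phases.

    true_phases    : (phi_12, phi_23, phi_31) of the source
    antenna_errors : (e_1, e_2, e_3), one unknown error per antenna
    """
    p12, p23, p31 = true_phases
    e1, e2, e3 = antenna_errors
    return (p12 + e1 - e2, p23 + e2 - e3, p31 + e3 - e1)


def closure_phase(p12, p23, p31):
    """Sum of the phases around the triangle of baselines, as in Eq. (1.14)."""
    return wrap(p12 + p23 + p31)
```

Each antenna error enters the sum once with a positive sign and once with a negative sign, so the closure phase of the measured values equals that of the true fringe phases whatever the individual errors are.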

The results on Cygnus A demonstrated that the simple models of Fig. 1.5 are 
not generally satisfactory for representation of radio sources. To determine even the 
most basic structure, it is necessary to measure the fringe visibility at spacings well 
beyond the first minimum of the visibility function to detect multiple components, 
and to make such measurements at a number of position angles across the source. 
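
The behavior of a two-component model on either side of the visibility minimum can be sketched numerically. The example below assumes two point components placed symmetrically about the phase center and separated by an angle s along the baseline direction, for which the normalized visibility at a spacing of u wavelengths is cos(πus) when the flux densities are equal. The 100-arcsec separation used in the test values is illustrative only and is not the measured separation for Cygnus A.

```python
import cmath
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def two_component_visibility(u, separation_rad, flux_a=1.0, flux_b=1.0):
    """Normalized complex visibility of two point components.

    u              : projected antenna spacing in wavelengths
    separation_rad : angular separation of the components in radians
    """
    half_phase = math.pi * u * separation_rad
    total = flux_a * cmath.exp(1j * half_phase) + flux_b * cmath.exp(-1j * half_phase)
    return total / (flux_a + flux_b)


def first_minimum_spacing(separation_rad):
    """Spacing (wavelengths) of the first visibility null for equal components."""
    return 1.0 / (2.0 * separation_rad)


def intensity_correlation(u, separation_rad):
    """Quantity measured by an intensity interferometer: |visibility|^2."""
    return abs(two_component_visibility(u, separation_rad)) ** 2
```

At equal distances below and above the null, the squared visibility is the same, so an intensity interferometer cannot distinguish the two sides, whereas the complex visibility changes sign (a 180° phase change) on passing through the minimum.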

An early interferometer aimed at achieving high angular resolution with high 
sensitivity was developed by Hanbury Brown et al. (1955) at the Jodrell Bank 
Fig. 1.10 Two-component model of Cygnus A derived by Jennison and Das Gupta (1953) using 
the intensity interferometer. Reprinted by permission from MacMillan Publishers Ltd.: Nature, 
172, 996-997, © 1953. 


Experimental Station. This interferometer used an offset local oscillator technique 
at one antenna that took the place of a phase switch and also enabled the frequency 
of the fringe pattern to be slowed down to within the response time of the chart 
recorder used to record the output. A radio link was used to bring the signal from 
the distant antenna. Three sources were found to have diameters less than 12” using 
spacings up to 20 km at 158 MHz observing frequency (Morris et al. 1957). During 
the 1960s, this instrument was extended to baselines of up to 134 km to achieve 
resolution of less than 1” and greater sensitivity (Elgaroy et al. 1962; Adgie et al. 
1965). The program later led to the development of a multielement, radio-linked 
interferometer known as the MERLIN array (Thomasson 1986). 


1.3.8 Early Survey Interferometers and the Mills Cross 


In the mid-1950s, the thrust of much work was toward cataloging larger numbers 
of sources with positions of sufficient accuracy to allow optical identification. The 
instruments operated mainly at meter wavelengths, where the spectrum was then 
much less heavily crowded with manmade emissions. A large interferometer at 
Cambridge used four antennas located at the corners of a rectangle 580 m east- 
west by 49 m north-south (Ryle and Hewish 1955). This arrangement provided 
both east-west and north-south fringe patterns for measurement of right ascension 
and declination. 

A different type of survey instrument was developed by Mills et al. (1958) at 
Fleurs, near Sydney, consisting of two long, narrow antenna arrays in the form of a 
cross, as shown in Fig. 1.11. Each array produced a fan beam, that is, a beam that is 
narrow in a plane containing the long axis of the array and wide in the orthogonal 
Fig. 1.11 Simplified diagram of the Mills cross radio telescope. The cross-shaped 
area represents the apertures of the two antennas. Diagram label: phase-switching receiver. 


direction. The outputs of these two arrays were combined in a phase-switching 
receiver, and the voltage-multiplying action produced a power-response pattern 
equal to the product of the voltage responses of the two arrays. This combined 
response had the form of a narrow pencil beam. The two arrays had a common 
electrical center, so there were no interferometer fringes. The arrays were 457 m 
long, and the cross produced a beam of width 49 arcmin and approximately circular 
cross section at 85.5 MHz. The beam pointed in the meridian and could be steered 
in elevation by adjusting the phase of the dipoles in the north-south arm. The sky 
survey made with this instrument provided a list of more than 2,200 sources. 

A comparison of the source catalogs from the Mills cross with those from the 
Cambridge interferometer, which initially operated at 81.5 MHz (Shakeshaft et al. 
1955), showed poor agreement between the source lists for a common area of 
sky (Mills and Slee 1957). The discrepancy was found to result principally from 
the occurrence of source confusion in the Cambridge observations. When two or 
more sources are simultaneously present within the antenna beams, they produce 
fringe oscillations with slightly different frequencies, resulting from differences in 
the source declinations. Maxima in the fringe amplitude, which occur when the 
fringe components happen to combine in phase, can mimic responses to sources. 
This was a serious problem because the beams of the interferometer antennas were 
too wide, a problem that did not arise in the Mills cross, which was designed to 
provide the required resolution for accurate positions in the single pencil beam. The 
frequency of the Cambridge interferometer was later increased to 159 MHz, thereby 
reducing the solid angles of the beams by a factor of four, and a new list of 471 
sources was rapidly compiled (Edge et al. 1959). This was the 3C survey (source 
numbers, listed in order of right ascension, are preceded by 3C, indicating the 
third Cambridge catalog). The revised version of this survey (Bennett 1962, the 3C 
catalog) had 328 entries (some additions and deletions) and became a cornerstone of 
radio astronomy for the following decade. To avoid confusion problems and errors 
in flux-density distributions determined with these types of instruments as well as 
single-element telescopes, some astronomers subsequently recommended that the 
density of sources cataloged should not, on average, exceed 1 in roughly 20 times 
the solid angle of the resolution element of the measurement instrument (Pawsey 
Fig. 1.12 Schematic diagrams of two instruments, in each of which a small antenna is moved 
to different positions between successive observations to synthesize the response that would be 
obtained with a full aperture corresponding to the rectangle shown by the broken line. The 
arrangement of two signal-multiplying correlators producing real (R) and imaginary (I) outputs 
is explained in Sect. 6.1.7. Instruments of both types, the T-shaped array (a), and the two-element 
interferometer (b), were constructed at the Mullard Radio Astronomy Observatory, Cambridge, 
England. 


1958; Hazard and Walsh 1959). This criterion depends on the slope of a source 
count vs. flux density distribution (Scheuer 1957). For a modern treatment of the 
effects of source confusion, see Condon (1974) and Condon et al. (2012). 

In the 1960s, a generation of new and larger survey instruments began to appear. 
Two such instruments developed at Cambridge are shown in Fig. 1.12. One was 
an interferometer with one antenna elongated in the east-west direction and the 
other north-south, and the other was a large T-shaped array that had characteristics 
similar to those of a cross, as explained in Sect. 5.3.3. In each of these instruments, 
the north-south element was not constructed in full, but the response with such 
an aperture was synthesized by using a small antenna that was moved in steps to 
cover the required aperture; a different position was used for each 24-h scan in right 
ascension (Ryle et al. 1959; Ryle and Hewish 1960). The records from the various 
positions were combined by computer to synthesize the response with the complete 
north-south aperture. An analysis of these instruments is given by Blythe (1957). 
The large interferometer produced the 4C (Fourth Cambridge) catalog containing 
over 4,800 sources (Gower et al. 1967). At Molonglo in Australia, a larger Mills 
cross (Mills et al. 1963) was constructed with arrays 1 mile long, producing a beam 
of 2.8-arcmin width at 408 MHz. The development of the Mills cross is described 
in papers by Mills and Little (1953), Mills (1963), and Mills et al. (1958, 1963). 
Crosses of comparable dimensions located in the Northern Hemisphere included 
one at Bologna, Italy (Braccesi et al. 1969), and one at Serpukhov, near Moscow in 
the former Soviet Union (Vitkevich and Kalachev 1966). 


1.3.9 Centimeter-Wavelength Solar Imaging 


A number of instruments have been designed specifically for imaging the Sun. The 
antennas were usually parabolic reflectors mounted to track the Sun, but since the 
Fig. 1.13 (a) A linear array of eight equally spaced antennas connected by a branching network in 
which the electrical path lengths from the antennas to the receiver input are equal. This arrangement 
is sometimes referred to as a grating array, and in practice, there are usually 16 or more antennas. 
(b) An eight-element grating array combined with a two-element array to enhance the angular 
resolution. A phase-switching receiver, indicated by the multiplication symbol, is used to form the 
product of the signal voltages from the two arrays. The receiver output contains the simultaneous 
responses of antenna pairs with 16 different spacings. Systems of this general type were known as 
compound interferometers. 


Sun is a strong radio source, the apertures did not have to be very large. Figure 1.13a 
shows an array of antennas from which the signals at the receiver input are aligned 
in phase when the angle $\theta$ between the direction of the source and a plane normal 
to the line of the array is such that $\ell_\lambda \sin\theta$ is an integer, where $\ell_\lambda$ is the unit antenna 
spacing measured in wavelengths. This type of array is sometimes referred to as a 
grating array, since it forms a series of fan-shaped beams, narrow in the $\theta$ direction, 
in a manner analogous to the response of an optical diffraction grating. It is useful 
only for solar observations in which all but one of the beams fall on “quiet” sky. 
Christiansen and Warburton (1955) obtained a two-dimensional image of the quiet 
Sun at 21-cm wavelength using both east-west and north-south grating arrays. 
These arrays consisted of 32 (east-west) and 16 (north-south) uniformly spaced, 
parabolic antennas. As the Sun moved through the sky, it was scanned at different 
angles by the different beams, and a two-dimensional map could be synthesized by 
Fourier analysis of the scan profiles. To obtain a sufficient range of scan angles, 
observations extending over eight months were used. In later instruments for solar 
imaging, it was generally necessary to be able to make a complete image within a 
day to study the variation of enhanced solar emission associated with active regions. 
Several instruments used grating arrays, typically containing 16 or 32 antennas and 
crossed in the manner of a Mills cross. Crossed grating arrays produce a rectangular 
matrix pattern of beams on the sky, and the rotation of the Earth enables sufficient 
scans to be obtained to provide daily maps of active regions and other features. 
Instruments of this type included crosses at 21-cm wavelength at Fleurs, Australia 
(Christiansen and Mullaly 1963), and at 10-cm wavelength at Stanford, California 
(Bracewell and Swarup 1961), and a T-shaped array at 1.9-m wavelength at Nançay, 
France (Blum et al. 1957, 1961). These were the earliest imaging arrays with large 
numbers (~16 or more) of antennas. 
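
The grating-array condition described above, in which the signals add in phase whenever the unit spacing in wavelengths multiplied by sin θ is an integer, can be illustrated with a short calculation. The sketch below assumes identical, isotropic elements and equal path lengths to the receiver, and the eight elements and unit spacing of 50 wavelengths used in the test values are illustrative choices only.

```python
import cmath
import math


def grating_response(n_antennas, spacing_wavelengths, theta_rad):
    """Normalized power response of a uniform linear array.

    theta_rad is measured from the plane normal to the line of the array.
    The response is 1 wherever spacing_wavelengths * sin(theta) is an integer.
    """
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(theta_rad)
    total = sum(cmath.exp(1j * k * phase_step) for k in range(n_antennas))
    return abs(total) ** 2 / n_antennas ** 2


def grating_lobe_angles(spacing_wavelengths):
    """Angles (radians) of all fan beams, one for each integer order m."""
    max_order = math.floor(spacing_wavelengths)
    return [math.asin(m / spacing_wavelengths)
            for m in range(-max_order, max_order + 1)]
```

For a unit spacing of 50 wavelengths, adjacent beams near the normal are separated by about 1/50 rad, or roughly 1.1°, which exceeds the angular diameter of the Sun (about 0.5°), so only one beam at a time falls on the solar disk.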

Figure 1.13b illustrates the principle of a configuration known as a compound 
interferometer (Covington and Broten 1957), which was used to enhance the 
performance of a grating array or other antenna with high angular resolution in one 
dimension. The system shown consists of the combination of a grating array with 
a two-element array. An examination of Fig. 1.13b shows that pairs of antennas, 
chosen one from the grating array and one from the two-element array, can be 
found for all spacings from 1 to 16 times the unit spacing $\ell_\lambda$. In comparison, the 
grating array alone provides only one to seven times the unit spacing, so the number 
of different spacings simultaneously contributing to the response is increased by a 
factor of more than two by the addition of two more antennas. Arrangements of this 
type were used to increase the angular resolution of one-dimensional scans of strong 
sources (Picken and Swarup 1964; Thompson and Krishnan 1965). By combining a 
grating array with a single larger antenna, it was also possible to reduce the number 
of grating responses on the sky (Labrum et al. 1963). Both the crossed grating arrays 
and the compound interferometers were originally operated with phase-switching 
receivers to combine the outputs of the two subarrays. In later implementations 
of similar systems, the signal from each antenna is converted to an intermediate 
frequency (IF), and a separate voltage-multiplying correlator is used for each 
spacing. This allows further possibilities in arranging the antennas to maximize the 
number of different antenna spacings, as discussed in Sect. 5.5. 
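
The count of spacings in the compound interferometer can be reproduced by enumerating antenna pairs. The positions used below are an assumed layout, in units of the unit spacing, chosen to be consistent with the description of Fig. 1.13b: an eight-element grating array at positions 0 to 7 and a two-element array at positions 8 and 16. The actual geometry of the figure is not given in the text, and the function names are hypothetical.

```python
def pair_spacings(array_a, array_b):
    """Distinct spacings between one antenna from each of two arrays."""
    return sorted({abs(b - a) for a in array_a for b in array_b})


def internal_spacings(array):
    """Distinct spacings between pairs of antennas within a single array."""
    return sorted({abs(q - p)
                   for i, p in enumerate(array)
                   for q in array[i + 1:]})


# Assumed positions in units of the unit spacing (illustrative layout).
GRATING = list(range(8))   # eight-element grating array
TWO_ELEMENT = [8, 16]      # two-element array, elements 8 units apart

compound = pair_spacings(GRATING, TWO_ELEMENT)   # cross-products of the two arrays
grating_only = internal_spacings(GRATING)        # grating array by itself
```

Under this assumed layout, the cross-products of the two arrays cover every spacing from 1 to 16 units, whereas the grating array alone covers 1 to 7.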


1.3.10 Measurements of Intensity Profiles 


Continuing measurements of the structure of radio sources indicated that in general, 
the intensity profiles are not symmetrical, so their Fourier transforms, and hence the 
visibility functions, are complex. This will be explained in detail in later chapters, 
but at this point, we note that it means that the phase of the fringe pattern (i.e., its 
position in time with respect to a fiducial reference), as well as the amplitude, varies 
with antenna spacing and must be measured to allow the intensity profiles to be 
recovered. To accommodate both fringe amplitude and phase, visibility is expressed 
as a complex quantity. Measurement of the fringe phase became possible in the 
1960s and 1970s, by which time a number of compact sources with well-determined 
positions, suitable for calibration of the fringe phase, were available. Electronic 
phase stability had also improved, and computers were available for recording 
and processing the output data. Improvements in antennas and receivers enabled 
measurements to be made at wavelengths in the centimeter range (frequencies 
greater than ~ 1 GHz), using tracking antennas. 

An interferometer at the Owens Valley Radio Observatory, California (Read 
1961), provides a good example of one of the earliest instruments used extensively 
for determining radio structure. It consisted of two 27.5-m-diameter parabolic 
antennas on equatorial mounts with a rail track system that allowed the spacing 
between them to be varied by up to 490 m in both the east-west and north-south 
directions. It was used mainly at frequencies from 960 MHz to a few GHz. 
Studies by Maltby and Moffet (1962) and Fomalont (1968) illustrate the use of 
this instrument for measurement of intensity distributions, an example of which is 
shown in Fig. 1.14. Lequeux (1962) studied the structure of about 40 extragalactic 
sources at 1400 MHz on a reconfigurable two-element interferometer with baselines 
up to 1460 m (east-west) and 380 m (north-south) at Nançay Observatory in 
France. These are early examples of model fitting of visibility data, a technique 
of continuing usefulness (see Sect. 10.4). 


1.3.11 Spectral Line Interferometry 


The earliest spectral line measurements were made with single narrowband filters. 
By the early 1960s, the interferometer at Owens Valley and several others had 
been fitted with spectral line receiving systems. The passband of each receiver was 
divided into a number of channels by a filter bank, usually in the IF stages, and for 
each channel, the signals from the two antennas went to a separate correlator. In 
later systems, the IF signals were digitized and the filtering was performed digitally, 
as described in Sect. 8.8. The width of the channels should ideally be less than 
that of the line to be observed so that the line profile can be studied. Spectral line 
interferometry allows the distribution of the line emission across a radio source to 
be examined. Roger et al. (1973) describe an array in Canada built specifically for 
observations in the 1420 MHz (21-cm wavelength) line of neutral hydrogen. 
Spectral lines can also be observed in absorption, especially in the case of 
the neutral hydrogen line. At the line frequency, the gas absorbs the continuum 
radiation from any more distant source that is observed through it. Comparison of 
the emission and absorption spectra of neutral hydrogen yields information on its 
temperature and density. Measurement of absorption spectra of sources can be made 
using single antennas, but in such cases, the antenna also responds to the broadly 
distributed emitting gas within the antenna beam. The absorption spectra for weak 
sources are difficult to separate from the line emission. With an interferometer, the 
broad emission features on the sky are almost entirely resolved and the narrow 
absorption spectrum can be observed directly. For early examples of hydrogen line 
absorption measurements, see Clark et al. (1962) and Hughes et al. (1971). 


1.3.12 Earth-Rotation Synthesis Imaging 


A very important step in the development of synthesis imaging was the use of the 
variation of the antenna baseline provided by the rotation of the Earth. Figure 1.15 
illustrates this principle, as described by Ryle (1962). For a source at a high 
declination, the position angle of the baseline projected onto a plane normal to the 
Fig. 1.15 Use of Earth rotation in synthesis imaging, as explained by Ryle (1962). The antennas 
A and B are spaced on an east-west line. By varying the distance between the antennas from 
one day to another, and observing for 12 h with each configuration, it is possible to encompass 
all the spacings from the origin to the elliptical outer boundary of the lower diagram. Only 12 h 
of observing at each spacing is required, since during the other 12 h, the spacings covered are 
identical but the positions of the antennas are effectively interchanged. Reprinted by permission 
from MacMillan Publishers Ltd.: Nature, 194, 517-518, © 1962. Diagram labels: North Pole; 
axis of Earth’s rotation. 


direction of the source rotates through 180° in 12 h. Thus, if the source is tracked 
across the sky for a series of 12-h periods, each one with a different antenna spacing, 
the required two-dimensional visibility data can be collected while the antenna 
spacing is varied in one dimension only. Calculation of two-dimensional Fourier 
transforms was an arduous task at this time. 
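
The geometry of Earth-rotation synthesis with an east-west baseline can be sketched as follows. The example assumes the standard expressions for the projected baseline components of an east-west baseline of length D (in wavelengths), u = D cos H and v = D sin H sin δ, where H is the hour angle and δ the declination. These expressions are not given in this section, and the function names are hypothetical.

```python
import math


def ew_baseline_uv(baseline_wavelengths, declination_deg, hour_angle_hours):
    """Projected (u, v) components for an east-west baseline (assumed model)."""
    h = math.radians(15.0 * hour_angle_hours)   # 1 h of hour angle = 15 degrees
    dec = math.radians(declination_deg)
    u = baseline_wavelengths * math.cos(h)
    v = baseline_wavelengths * math.sin(h) * math.sin(dec)
    return u, v


def half_day_track(baseline_wavelengths, declination_deg, n_points=13):
    """Sample the track from hour angle -6 h to +6 h (12 h of observing)."""
    hours = [-6.0 + 12.0 * i / (n_points - 1) for i in range(n_points)]
    return [ew_baseline_uv(baseline_wavelengths, declination_deg, h)
            for h in hours]
```

Each track is half of an ellipse with axial ratio sin δ. A point observed 12 h later is the same point reflected through the origin, which corresponds to interchanging the two antennas, so the second 12 h adds no new spacings.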

The Cambridge One-Mile Radio Telescope was the first instrument designed to 
exploit fully the Earth-rotation technique and apply it to a large number of radio 
sources. The use of Earth rotation was not a sudden development in radio astronomy; 
the technique had been applied in solar studies for a number of years. O’Brien (1953) made 
two-dimensional Fourier synthesis observations with a movable-element interferometer, 
and, as noted earlier, Christiansen and Warburton (1955) had obtained a two- 
Fig. 1.16 Contour image of the source Cygnus A, which was one of the first results (Ryle 
et al. 1965) from the Cambridge One-Mile Telescope using the Earth-rotation principle shown 
in Fig. 1.15. The frequency is 1.4 GHz. The image has been scaled in declination so that the half- 
power beam contour is circular, as shown by the shaded area in the lower right corner. The dotted 
ellipse shows the outer boundary of the optical source, and its central structure is also indicated. 
Reprinted by permission from MacMillan Publishers Ltd.: Nature, 205, 1259-1262, © 1965. 


dimensional map of the Sun, using tracking antennas in two grating arrays. At 
Jodrell Bank, Rowson (1963) had used a two-element interferometer with tracking 
antennas to map strong nonsolar sources. Also, Ryle and Neville (1962) had imaged 
the north polar region using Earth rotation to demonstrate the technique. However, 
the first images published from the Cambridge One-Mile telescope, those of the 
strong sources Cassiopeia A and Cygnus A (Ryle et al. 1965), exhibited a degree of 
structural detail unprecedented in earlier studies and heralded the development of 
synthesis imaging. The image of Cygnus A is shown in Fig. 1.16. 


1.3.13 Development of Synthesis Arrays 


Following the success of the Cambridge One-Mile Telescope, interferometers such 
as the NRAO instrument at Green Bank, West Virginia (Hogg et al. 1969), were 
rapidly adapted for synthesis imaging. Several large arrays designed to provide 
increased imaging speed, sensitivity, and angular resolution were brought into 
Fig. 1.17 Contour image of the source Cygnus A using the Cambridge Five-Kilometre Radio 
Telescope at 5 GHz. This showed for the first time the radio nucleus associated with the central 
galaxy and the high intensity at the outer edges of the radio lobes. From Hargrave and Ryle (1974). 
© Royal Astronomical Society, used with permission. 


operation during the 1970s. Prominent among these were the Five-Kilometre Radio 
Telescope at Cambridge, England (Ryle 1972), the Westerbork Synthesis Radio 
Telescope in the Netherlands (Baars et al. 1973), and the Very Large Array (VLA) 
in New Mexico (Thompson et al. 1980; Napier et al. 1983). With these instruments, 
imaging of radio sources with a resolution of less than 1” at centimeter wavelengths 
was possible. By using $n_a$ antennas, as many as $n_a(n_a - 1)/2$ simultaneous baselines 
can be obtained. If the array is designed to avoid redundancy in the antenna 
spacings, the speed with which the visibility function is measured is approximately 
proportional to $n_a^2$. Images of Cygnus A obtained with two of the arrays mentioned 
above are shown in Figs. 1.17 and 1.18. Resolution of the central source was 
first achieved with very-long-baseline interferometry (VLBI, see Sect. 1.3.14) 
(Linfield 1981). A more recent VLBI image is shown in Fig. 1.19. A review of the 
development of synthesis instruments at Cambridge is given in the Nobel lecture 
by Ryle (1975). An array with large collecting area, the Giant Metrewave Radio 
Telescope (GMRT), which operates at frequencies from 38 to 1420 MHz, was 
completed in 1998 near Pune, India (Swarup et al. 1991). More recently, advances 
in broadband antenna technology and large-scale integrated circuits have enabled 
further large increases in performance. For example, the capability of the VLA was 
greatly improved with an updated electronic system (Perley et al. 2009).⁴ 
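
The count of simultaneous baselines quoted above, n_a(n_a − 1)/2, can be confirmed by enumerating antenna pairs. This is a minimal sketch with hypothetical function names; the value of 27 antennas used in the test values is the number of antennas in the VLA, a figure that is not stated in this passage.

```python
from itertools import combinations


def baseline_pairs(n_antennas):
    """All distinct antenna pairs (m, n) with m < n, antennas numbered from 1."""
    return list(combinations(range(1, n_antennas + 1), 2))


def baseline_count(n_antennas):
    """Closed-form number of simultaneous baselines, n(n - 1)/2."""
    return n_antennas * (n_antennas - 1) // 2
```

Three antennas give the three pairs used in the closure measurement of Eq. (1.14), and 27 antennas give 351 baselines. For large arrays the count approaches half the square of the number of antennas, which is the origin of the approximate proportionality mentioned in the text.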


*The upgraded VLA was formally rededicated as the Karl G. Jansky Very Large Array and is
sometimes referred to as the JVLA.

Fig. 1.18 Image of Cygnus A made with the VLA at 4.9 GHz. Observations with four configurations
of the array were combined, and the resolution is 0.4”. The display of the image shown
here involves a nonlinear process to enhance the contrast of the fine structure. This emphasizes 
the jet from the central galaxy to the northwestern lobe (top right) and the filamentary structure in 
the main lobes. Comparison with other records of Cygnus A in this chapter illustrates the technical 
advances made during three decades. Reproduced by permission of NRAO/AUI. From Perley et al. 
(1984). © AAS. Reproduced with permission.

Fig. 1.19 VLBI image of the central part of Cygnus A at 5 GHz, made with a ten-station global
VLBI array. The resolution is 2 mas, and the rms noise level is 0.4 mJy/beam. The coordinates are
centered on the core components. The knots in the jet have apparent expansion speeds of ~0.4c.
The counter jet to the left of the core is clearly visible. The jet structure is more clearly defined 
in an image at 43 GHz with a resolution of 0.15 mas by Boccardi et al. (2016). From Carilli et al. 
(1994). © AAS. Reproduced with permission.

During the 1980s and 1990s, synthesis arrays operating at short millimeter
wavelengths (frequencies of 100 GHz or greater) were developed. Spectral lines are 
particularly numerous at these frequencies (see Fig. 1.2). Several considerations are 
more important at millimeter wavelengths than at centimeter wavelengths. Because 
the wavelengths are much shorter, any irregularity in the atmospheric path length 
results in a proportionately greater effect on the signal phase. Attenuation in the neu- 
tral atmosphere is much more serious at millimeter wavelengths. Also, the beams of 
the individual antennas become narrower at shorter wavelengths, and maintenance 
of a sufficiently wide field of view is one reason why the antenna diameter tends 
to decrease with increasing frequency. Thus, to obtain the necessary sensitivity, 
larger numbers of antennas are required than at centimeter wavelengths. Arrays for 
millimeter wavelengths have included those at Hat Creek, California (Welch 1994); 
Owens Valley, California (Scoville et al. 1994)$^5$; Nobeyama, Japan (Morita 1994);
the Plateau de Bure, France (Guilloteau 1994); and Mauna Kea, Hawaii (Ho et al. 
2004). The largest such array, the Atacama Large Millimeter/submillimeter Array
(ALMA), consists of fifty 12-m-diameter antennas in one array and twelve 7-m-diameter
antennas in another. Located in the Atacama Desert of Chile, a dry site at ~5,000-m
elevation, it operates over a frequency range of 31-950 GHz with antenna spacings
of up to 14 km. The field of view, defined by the beamwidths of the antennas,
is only about 8” at 345 GHz in the primary array. It is an international facility and 
came into operation in 2013 (Wootten and Thompson 2009). 


1.3.14 Very-Long-Baseline Interferometry 


Investigation of the angular diameters of quasars and other objects that appear nearly 
pointlike in structure presented an important challenge throughout the early years 
of radio astronomy. An advance that led to an immediate increase of an order of 
magnitude in resolution, and subsequently to several orders more, was the use of 
independent local oscillators and signal recorders. By using local oscillators at each 
antenna that are controlled by high-precision frequency standards, it is possible to 
preserve the coherence of the signals for time intervals long enough to measure 
interference fringes. In the early years, the received signals were converted to an 
intermediate frequency low enough that they could be recorded directly on magnetic 
tape and then brought together and played into a correlator. This technique became 
known as very-long-baseline interferometry (VLBI), and the early history of its 
development is discussed by Broten (1988), Kellermann and Cohen (1988), Moran 
(1998), and Kellermann and Moran (2001). The technical requirements for VLBI 
were discussed in the USSR in the early 1960s (see, e.g., Matveenko et al. 1965). 
A successful early experiment was performed in January 1967 by a group at the 
University of Florida, who detected fringes from the burst radiation of Jupiter at


$^5$The arrays at Hat Creek and Owens Valley were combined at Cedar Flats, a high site east of
Owens Valley, to form the CARMA array, which operated from 2005 to 2015.

18 MHz (Brown et al. 1968). Because of the strong signals and low frequency, the
required recording bandwidth was only 2 kHz and the frequency standards were 
crystal oscillators. Much more sensitive and precise VLBI systems, which used 
wider bandwidths and atomic frequency standards, were developed by three other 
groups. In Canada, an analog recording system was developed with a bandwidth 
of 1 MHz based on television tape recorders (Broten et al. 1967). Fringes were 
obtained at a frequency of 448 MHz on baselines of 183 and 3074 km on several 
quasars in April 1967. In the United States, another group from the National Radio 
Astronomy Observatory and Cornell University developed a computer-compatible 
digital recording system with a bandwidth of 360 kHz (Bare et al. 1967). They 
obtained fringes at 610 MHz on a baseline of 220 km on several quasars in May 
1967. A third group from MIT joined in the development of the NRAO-Cornell 
system in early 1967 and obtained fringes at a frequency of 1665 MHz on a baseline 
of 845 km on several OH-line masers, with spectroscopic analysis, in June 1967 
(Moran et al. 1967). 

The initial experiments used signal bandwidths of less than a megahertz, but 
by the 1980s, systems capable of recording signals with bandwidths greater than 
100 MHz were available, with corresponding improvements in sensitivity. Real-time 
linking of the signals from remote telescopes to the correlator via a geostationary 
satellite was demonstrated (Yen et al. 1977). Also, experiments were performed in 
which the local oscillator signal was distributed over a satellite link (Knowles et al. 
1982). Neither of these satellite-supported techniques has been used significantly,
for practical and economic reasons. Most importantly, the worldwide
network of fiber-optic transmission lines, which has since become available,
allows real-time transmission of the data to the correlator. These developments,
as well as the advent of sophisticated data analysis techniques, have lessened 
the distinction between VLBI and more conventional forms of interferometry. A 
detailed technical description of issues specific to the VLBI technique is given in 
Chap. 9. 

An early example of the extremely high angular resolution that can be achieved 
with VLBI is provided by a measurement by Burke et al. (1972), who obtained a 
resolution of 200 μas using antennas in Westford, Massachusetts, and near Yalta
in the Crimea, operating at a wavelength of 1.3 cm. Early measurements, obtained 
using a few baselines only, were generally interpreted in terms of the simple models 
in Fig. 1.5. Important results were the discovery and investigation of superluminal 
(apparently faster-than-light) motions in quasars (Whitney et al. 1971; Cohen et al. 
1971), as shown in Fig. 1.20, and the measurement of proper motion in H2O line
masers (Genzel et al. 1981). During the mid-1970s, several groups of astronomers 
began to combine their facilities to obtain measurements over ten or more baselines 
simultaneously. In the United States, the Network Users’ Group, later called the U.S. 
VLBI Consortium, included the following observatories: Haystack Observatory in 
Massachusetts (NEROC); Green Bank, West Virginia (NRAO); Vermilion River 
Observatory in Illinois (Univ. of Illinois); North Liberty in Iowa (Univ. of Iowa); 
Fort Davis, Texas (Harvard College Observatory); Hat Creek Observatory, California
(Univ. of California); and Owens Valley Radio Observatory, California. Other
arrays, such as the European VLBI Network (EVN), soon developed. Observations
on such networks led to more complex models [see, e.g., Cohen et al. (1975)].

Fig. 1.20 VLBI images of the quasar 3C273 at five epochs, showing the relative positions of two
components. From the distance of the object, deduced from the optical redshift, the apparent
relative velocity of the components exceeds the velocity of light, but this can be explained by
relativistic and geometric effects. The observing frequency is 10.65 GHz. An angular scale of
2 mas is shown in the lower right corner. From Pearson et al. (1981). Reprinted by permission
from MacMillan Publishers Ltd.: Nature, 290, 365-368, © 1981.

A problem in VLBI observations is that the use of nonsynchronized local 
oscillators complicates the calibration of the phase of the fringes. It became evident 
early on that VLBI represented an intermediate form of interferometer between 
the intensity interferometer and the perfectly stable coherent interferometer (Clark 
1968). Techniques were developed to combine coherent averaging on timescales up 
to a defined coherence time, followed by incoherent averaging. These techniques 
remain useful in VLBI at very high frequencies. To overcome the problem of 
calibration of phase for coherently averaged data, the phase closure relationship 
of Eq. (1.14) was first applied to VLBI data by Rogers et al. (1974). The technique 
rapidly developed into a method to obtain images known as hybrid mapping. For 
examples of hybrid mapping, see Figs. 1.19 and 1.20. This method was subsumed 
into the more general approach called self-calibration (see Chap. 11). For some 
spectral line observations in which the source consists of spatially isolated masers, 
the signals from which can be separated by their individual Doppler shifts, phase 
referencing techniques can be used (e.g., Reid et al. 1980). 

The first array of antennas built specifically for astronomical measurements 
by VLBI, the Very Long Baseline Array (VLBA) of the U.S. National Radio 
Astronomy Observatory (NRAO), was brought into operation in 1994. It consists of 
ten 25-m-diameter antennas, one in the U.S. Virgin Islands, eight in the continental 
United States, and one in Hawaii (Napier et al. 1994). The VLBA is often linked 
with additional antennas to further improve the baseline coverage and sensitivity. 
Figure 1.21 presents a result from the combined VLBA and EVN array. 

The great potential of VLBI in astrometry and geodesy was immediately 
recognized after the initial experiments in 1967 [see, e.g., Gold (1967)]. A seminal 
meeting defining the role of VLBI in Earth dynamics programs was held in 
Williamstown, Massachusetts, in 1969 (Kaula 1970). The use of VLBI in these 
applications developed rapidly during the 1970s and 1980s; see, for example, 
Whitney et al. (1976) and Clark et al. (1985). In the United States, NASA and several 
other federal agencies set up a cooperative program of geodetic measurements in the 
mid-1970s. This work evolved in part from the use of the Jet Propulsion Laboratory 
deep-space communications facilities for VLBI observations. It has expanded into 
an enormous worldwide effort carried out under the aegis of the International VLBI 
Service (IVS) and a network of more than 40 antennas. An important result of this 
effort has been the establishment of the International Celestial Reference Frame 
adopted by the IAU, which is based on 295 “defining” sources whose positions are 
known to an accuracy of about 40 μas (Fey et al. 2015). Another striking result of
the geodetic VLBI work has been the detection of contemporary plate motions in 
the Earth’s mantle, first measured as a change in the Westford–Onsala baseline at
a rate of 17 ± 2 mm/yr (Herring et al. 1986). The VLBI measurements of plate
motions are shown in Fig. 1.22. Astrometry with submilliarcsecond accuracy has
opened up new possibilities in astronomy, for example, the detection of the motion 
of the Sun around the Galactic center from the proper motion of Sagittarius A* 
(Backer and Sramek 1999; Reid and Brunthaler 2004) and measurements of the

Fig. 1.21 Image of the gravitational lens source MG J0751+2716 made with a 14-h observation
on a 21-element global VLBI array (VLBA and the EVN plus the Green Bank telescope) at a
frequency of 1.7 GHz. The rms noise level is 12 μJy, and the resolution is 2.2 × 5.6 mas. This
image of an extended background source at redshift 3.2 is highly distorted by an unseen foreground 
radio-quiet galaxy at a redshift of 0.35 into extended arcs. Image courtesy of and © John McKean. 


annual parallaxes of galactic radio sources (Reid and Honma 2014). Astrometric 
and geodetic methods are described in Chap. 12. 

The combination of VLBI with spectral line processing is particularly effective 
in the study of problems that involve both astrometry and dynamical analysis of 
astronomical systems. The galaxy NGC4258, which exhibits an active galactic 
nucleus, has been found to contain a number of small regions that emit strongly in 
the 22.235-GHz water line as a result of maser processes. VLBI observations have 
provided an angular resolution of 200 μas, an accuracy of a few microarcseconds
in the relative positions of the masers, and measurements of Doppler shifts to an
accuracy of 0.1 km s$^{-1}$ in radial velocity (see Fig. 1.23). NGC4258 is fortuitously
aligned so that the disk is almost edge-on as viewed from the Earth. The orbital
velocities of the masers, which obey Kepler’s law, are accurately determined as a
function of radius from the center of motion. Hence, the distance can be found by
comparing the linear and angular motions. The angular motions are about 30 μas
per year. These results provide a value for the central mass of $3.9 \times 10^7$ times

Fig. 1.22 Tectonic plate motions measured with VLBI. A VLBI station is located at the foot of
each vector and labeled by the station name. The sum of the motion vectors is constrained to
be zero. The largest motion is for the Kokee site in Hawaii, about 8 cm yr$^{-1}$. Plate boundaries,
established by other techniques, are shown by the jagged lines. From Whitney et al. (2014). 
Reprinted with permission courtesy of and © MIT Lincoln Laboratory, Lexington, MA. 


the mass of the Sun, presumably a supermassive black hole (Miyoshi et al. 1995; 
Herrnstein et al. 1999), and 7.6 ± 0.2 Mpc for the distance (Humphreys et al. 2013).
The uncertainty of 3% in the distance of an extragalactic object, measured directly, 
set a precedent. 
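
The principle of comparing linear and angular motions can be illustrated numerically. The sketch below is not the published analysis, which fits a warped-disk model to many maser features; it simply divides a transverse speed by a proper motion. The 30 μas per year is taken from the text, whereas the orbital speed of 1000 km s$^{-1}$ is an assumed round number chosen for illustration, and the function name distance_mpc is hypothetical.

```python
import math

KM_PER_PC = 3.0857e13                          # kilometres in one parsec
SEC_PER_YR = 3.15576e7                         # seconds in one Julian year
RAD_PER_UAS = math.pi / (180 * 3600 * 1.0e6)   # radians per microarcsecond


def distance_mpc(v_kms: float, mu_uas_per_yr: float) -> float:
    """Distance (Mpc) at which a transverse speed v_kms (km/s) is seen
    as a proper motion of mu_uas_per_yr (microarcseconds per year)."""
    v_pc_per_yr = v_kms * SEC_PER_YR / KM_PER_PC
    mu_rad_per_yr = mu_uas_per_yr * RAD_PER_UAS
    return v_pc_per_yr / mu_rad_per_yr / 1.0e6


# Assumed orbital speed (illustrative) and the proper motion quoted in the text.
print(f"{distance_mpc(1000.0, 30.0):.1f} Mpc")
```

With these inputs the result is about 7 Mpc, the same order as the measured distance; the agreement should not be read as more than a consistency check, since the input speed is assumed.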


1.3.15 VLBI Using Orbiting Antennas 


The use of spaceborne antennas in VLBI observations is referred to as the OVLBI 
(orbiting VLBI) technique. The first observations of this type were made in 1986 
using a satellite of the U.S. Tracking and Data Relay Satellite System (TDRSS). 
These satellites were in geostationary orbit at a height of approximately 36,000 km 
and were used to relay data from low-Earth-orbit spacecraft to Earth. They carried 
two 4.9-m antennas used to communicate with other satellites at 2.3 and 15 GHz 
and a smaller antenna for the space-to-Earth link. In this experiment, one of the 4.9- 
m antennas was used to observe a radio source, and the other received a reference 
signal from a hydrogen maser on the ground (Levy et al. 1989). The received signals

Fig. 1.23 Image of the water vapor maser disk in the core of the galaxy NGC4258 at 1.35 cm
made with the VLBA. The spots mark the positions of the unresolved maser components. The
elliptical grid lines denote the thin, slightly warped disk that the masers trace. The position of the
gravitational center is shown by the black square. The contour plot shows the continuum emission
from the central active galactic nucleus. Each maser spot corresponds to a feature in the spectrum
in the lower panel. The strongest feature, at 470 km s$^{-1}$, serves as a phase reference. The inset
shows the radial velocity of the masers vs. radial distance from the black hole in milliarcseconds. 
From Herrnstein et al. (2005). © AAS. Reproduced with permission. 


were transmitted to the ground and recorded on a VLBI tape system for correlation 
with signals from ground-based antennas. The numbers of sources detected were 
23 and 11 at 2.3 and 15 GHz, respectively (Linfield et al. 1989, 1990). At 15 GHz, 
the fringe width was of order 0.3 mas, and interpretation of the results in terms of 
circular Gaussian models indicated brightness temperatures as high as $2 \times 10^{12}$ K.

VLBI observations using a satellite in a non-geostationary orbit were first made 
in 1997 by the VLBI Space Observatory Programme (VSOP) (Hirabayashi et al. 
1998), designed specifically for VLBI observations. It was equipped with an antenna 
of 8-m diameter, and observations were made at 1.6 and 5 GHz. The orbital period 
was approximately 6.6 h and the apogee height, 21,000 km. VSOP was followed 
by the RadioAstron satellite, which was launched in 2011 into an orbit with an 
apogee height of about 300,000 km and a period of 8.3 days (Kardashev et al. 
2013). It is equipped with an antenna of 10-m diameter and receivers at 18, 6, and 
1.35 cm. Operating with ground-based telescopes, it can attain a resolution of 8 μas
at 1.35 cm. More information about satellite VLBI can be found in Sect. 9.10.

The possibility of achieving very long baselines by reflection from the Moon has
been discussed by Hagfors et al. (1990). Reflection from the surface of the Moon 
could provide baselines up to a length approaching the radius of the lunar orbit. An 
antenna of 100-m aperture, or larger, would be used to track the Moon and receive 
the reflected signal from the source under study, and a smaller antenna could be 
used for the direct signal. It is estimated that the sensitivity would be about three 
orders of magnitude less than would be obtained by observing the source directly 
with both antennas. Further complications result from the roughness of the lunar 
surface and from libration. The technique could be useful for special observations 
requiring very high angular resolution of strong sources, for example, for the burst 
radiation from Jupiter. However, RadioAstron provides baselines almost as long. 
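
The resolutions quoted in this section can be compared with the fringe spacing $\lambda/D$, as in the rough sketch below. Two assumptions should be kept in mind: the apogee heights given in the text are used directly as baseline lengths, which ignores projection and the position of the ground station, and the mean Earth-Moon distance of 384,400 km is a value supplied here, not taken from the text. The function name fringe_spacing_uas is our own.

```python
import math


def fringe_spacing_uas(wavelength_m: float, baseline_m: float) -> float:
    """Fringe spacing lambda/D, converted from radians to microarcseconds."""
    return math.degrees(wavelength_m / baseline_m) * 3600.0 * 1.0e6


# (wavelength in m, baseline in m); baselines are apogee heights from the
# text, except the lunar value, which is an assumed mean Earth-Moon distance.
cases = {
    "VSOP, 6 cm, 21,000 km": (0.06, 21_000e3),
    "RadioAstron, 1.35 cm, 300,000 km": (0.0135, 300_000e3),
    "Lunar reflection, 1.35 cm, 384,400 km (assumed)": (0.0135, 384_400e3),
}
for label, (wavelength, baseline) in cases.items():
    print(f"{label}: {fringe_spacing_uas(wavelength, baseline):.1f} microarcsec")
```

The RadioAstron case gives a fringe spacing of roughly 9 μas, close to the 8 μas quoted above, and the lunar-reflection case is only modestly finer, in line with the remark that RadioAstron provides baselines almost as long.
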


1.4 Quantum Effect 


The development of VLBI introduced a new facet into the apparent paradox in 
the quantum-mechanical description of interferometry (Burke 1969). The radio 
interferometer is the analog of Young’s two-slit interference experiment. It is well 
known (Loudon 1973) that a single photon creates an interference pattern but that 
any attempt to determine which slit the photon entered will destroy the interference 
pattern; otherwise, the uncertainty principle would be violated. Consideration of 
VLBI suggests that it might be possible to determine at which antenna a particular 
photon arrived, since its signature is captured in the medium used for transmission to 
the correlator as well as in the fringe pattern generated during correlation. However, 
in the radio frequency range, the input stages of receivers used as the measurement 
devices consist of amplifiers or mixers that conserve the received phase in their 
outputs. This allows formation of the fringes in subsequent stages. The response 
of such devices must be consistent with the uncertainty principle, $\Delta E\,\Delta t \simeq h/2\pi$,
where $\Delta E$ and $\Delta t$ are the uncertainties in signal energy and measurement time. This
principle can be written in terms of uncertainty in photon number, $\Delta N_p$, and phase,
$\Delta\phi$, as

$$\Delta N_p\,\Delta\phi \simeq 1, \tag{1.15}$$

where $\Delta N_p = \Delta E/h\nu$ and $\Delta\phi = 2\pi\nu\,\Delta t$. To preserve phase, $\Delta\phi$ must be small,
so $\Delta N_p$ must be correspondingly large, and there must be an uncertainty of at least
one photon per unit bandwidth per unit time in the output of the receiving amplifier.
Hence, the SNR is less than unity in the single-photon limit, and it is impossible to 
determine at which antenna a single photon entered. An alternative but equivalent 
statement is that the output of any receiving system must contain a noise component 
that is not less than an equivalent input power approximately equal to hv per unit 
bandwidth. 

The individual photons that constitute a radio signal arrive at antennas at random 
times but with an average rate that is proportional to the signal strength. For
phenomena of this type, the number of events that occur in a given time interval
$\tau$ varies statistically in accordance with the Poisson distribution. For a signal power
$P_{\rm sig}$, the average number of photons that arrive within time $\tau$ is $N_p = P_{\rm sig}\tau/h\nu$. The
rms deviation of the number arriving during a series of intervals $\tau$ is, for Poisson
statistics, given by $\Delta N_p = \sqrt{N_p}$. From Eq. (1.15), the resulting uncertainty in the
signal phase is

$$\Delta\phi \simeq \frac{1}{\sqrt{N_p}} = \sqrt{\frac{h\nu}{P_{\rm sig}\,\tau}}. \tag{1.16}$$

We can also express the uncertainty in the measurement of the signal phase in terms
of the noise that is present in the receiving system. The minimum noise power,
$P_{\rm noise}$, is approximately equal to the thermal noise from a matched resistive load at
temperature $h\nu/k$, that is, $P_{\rm noise} = h\nu\,\Delta\nu$. The uncertainty in the phase, as measured
with an averaging time $\tau$, becomes

$$\Delta\phi \simeq \sqrt{\frac{P_{\rm noise}}{P_{\rm sig}\,\tau\,\Delta\nu}}. \tag{1.17}$$

Note that $\Delta\phi$ is the accuracy with which the phase of the amplified signal received
from one antenna can be measured, for example, in Doppler tracking of a spacecraft
(Cannon 1990). This is not to be confused with the accuracy of measurement of
the fringe phase of an interferometer. For a frequency $\nu = 1$ GHz, the effective
noise temperature $h\nu/k$ is equal to 0.048 K. Thus, for frequencies up to some
tens of gigahertz, the quantum effect noise makes only a small contribution to
the receiver noise. At 900 GHz, which is generally considered to be about the
high-frequency limit for ground-based radio astronomy, $h\nu/k = 43$ K, and the
contribution to the system noise is becoming important. In the optical region,
$\nu \approx 500$ THz, $h\nu/k \approx 30{,}000$ K, and heterodyne systems are of limited practicality,
as discussed in Sect. 17.6.2. However, in the optical region, it is possible to build
“direct detection” devices that detect power without conserving phase, so $\Delta\phi$ in
Eq. (1.17) effectively tends to infinity, and there is no constraint on the measurement
accuracy of the number of photons. Thus, most optical interferometers form fringes
directly from the light received and measure the resulting patterns of light intensity
to determine the fringe parameters.

For further reading on the general subject of thermal and quantum noise, see,
for example, Oliver (1965) and Kerr et al. (1997). Nityananda (1994) compares
quantum issues in the radio and optical domains, and a discussion of basic concepts
is given by Radhakrishnan (1999).
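
The numerical values of $h\nu/k$ quoted in this section, and the equivalence of the photon-counting and noise-power forms of the phase uncertainty, can be verified with the short sketch below. It is an illustration under stated assumptions: the function names are hypothetical, and the signal power, bandwidth, and averaging time in the example are arbitrary values chosen only to exercise the formulas.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J s
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K


def quantum_noise_temperature(freq_hz: float) -> float:
    """Effective noise temperature h*nu/k in kelvin."""
    return H_PLANCK * freq_hz / K_BOLTZMANN


def phase_uncertainty_photons(p_sig_w: float, freq_hz: float, tau_s: float) -> float:
    """Phase uncertainty 1/sqrt(N_p), with N_p = P_sig * tau / (h * nu)."""
    n_p = p_sig_w * tau_s / (H_PLANCK * freq_hz)
    return 1.0 / math.sqrt(n_p)


def phase_uncertainty_noise(p_sig_w: float, freq_hz: float, tau_s: float,
                            bandwidth_hz: float) -> float:
    """Phase uncertainty from the minimum noise power P_noise = h*nu*dnu."""
    p_noise = H_PLANCK * freq_hz * bandwidth_hz
    return math.sqrt(p_noise / (p_sig_w * tau_s * bandwidth_hz))


for freq in (1.0e9, 900.0e9):
    print(f"{freq / 1e9:6.0f} GHz: h nu / k = {quantum_noise_temperature(freq):.3f} K")

# Arbitrary illustrative signal: 1e-18 W at 1 GHz, 1 MHz bandwidth, 1 s average.
a = phase_uncertainty_photons(1.0e-18, 1.0e9, 1.0)
b = phase_uncertainty_noise(1.0e-18, 1.0e9, 1.0, 1.0e6)
print(f"phase uncertainty: {a:.3e} rad (photons), {b:.3e} rad (noise power)")
```

The two phase estimates agree because the bandwidth cancels when $P_{\rm noise} = h\nu\,\Delta\nu$ is inserted in the noise-power form.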




Appendix 1.1 Sensitivity of Radio Astronomical Receivers 
(the Radiometer Equation) 


An idealized block diagram of the basic receiver configuration widely used in 
radio astronomy is shown in Fig. A1.1. We describe its function and analyze its
performance in this appendix. The signal from an antenna is first passed through
an amplifier. The amplifier is characterized by its power gain factor, $G$; receiver
temperature, $T_R$; and bandwidth, $\Delta\nu$. The gain factor is assumed to be constant. If
the gain is sufficiently high, this amplifier sets the noise performance of the entire
system, characterized by the system temperature $T_S$, which includes the contributions
from the atmosphere, ground pickup, and ohmic losses. We assume that the passband has
a rectangular shape that is flat between a lower cutoff frequency, $\nu_0$, and the upper
cutoff frequency, $\nu_0 + \Delta\nu$.
The signal then passes through a mixer, where it is multiplied by a sinusoidal local
oscillator signal at frequency $\nu_0$ and is converted to a baseband from 0 to $\Delta\nu$.
In the next stage, the signal is converted to a digital data stream sampled at the
Nyquist rate. According to the Nyquist sampling theorem, a bandlimited signal
can be represented by samples taken at intervals of $1/(2\Delta\nu)$. We assume there is
no quantization error in this sampling process. In this case, the original signal can 
be exactly reconstructed from the sampled sequence by convolution with a sinc 
function. The sampled signal has the same statistical properties as the corresponding 
analog signal. The next step is a square-law detector, which squares the amplitudes 
of the signal samples. This is followed by an averager, which simply averages N 
samples in a running mean fashion. A system with these features is known as 
a single-sideband superheterodyne receiver (Armstrong 1921) or, simply, a total- 
power radiometer. Early interferometer receivers were a variation on this basic 
design (see Fig. 1.6): Signals from two antennas were added after the mixing stage 
before entering the square-law detector, and there was no signal digitization. 


Fig. A1.1 A block diagram of an idealized radiometer used in most radio astronomical systems 
for measuring total power. The system temperature $T_S$ includes the receiver temperature $T_R$ plus
all unwanted additive contributions (e.g., ohmic losses, atmospheric effects, ground pickup). In
practice, at very low frequencies (< 100 MHz), downconversion may be omitted, while at high
frequencies (more than a few gigahertz), multiple stages of frequency downconversion are required.
At very high frequencies (> 100 GHz), where low noise amplifiers are not available, the first stage
is usually the mixer. In this case, its losses and those of the amplifiers following it contribute to $T_S$.

The statistical performance of the idealized system in Fig. A1.1 can be readily
evaluated. The power level at any point in the system can be characterized by a
temperature $T$ according to the Nyquist relation [e.g., Eq. (1.4)]

$$P = kT\,\Delta\nu\,G, \tag{A1.1}$$

where we have included the effect of power amplification by the gain factor $G$. The
voltage $v_1$ is a combination of antenna input, characterized by $T_A$, and the additive
system noise, $T_S$. We assume the cosmic input signal has a flat spectrum over the
baseband frequency range. Hence, $v_1$ is a zero-mean Gaussian random process with
a flat spectrum, i.e., a white noise spectrum. Such a process, described by $p(v)$,
has only one parameter, the variance, $\sigma^2$. The odd moments of the probability
distribution are zero, and the even moments are

$$\langle v^n \rangle = [1 \cdot 3 \cdot 5 \cdots (n-1)]\,\sigma^n. \tag{A1.2}$$

The expectations of $v_1$ and $v_1^2$ (the power) are therefore

$$\langle v_1 \rangle = 0, \tag{A1.3}$$

$$\langle v_1^2 \rangle = k(T_S + T_A)\,\Delta\nu\,G. \tag{A1.4}$$

The statistics of the sampled signal, $v_{1s}$, and the analog signal, $v_1$, are the same, i.e.,
$\langle v_{1s} \rangle = \langle v_1 \rangle$, $\langle v_{1s}^2 \rangle = \langle v_1^2 \rangle$, etc. The characteristics of $v_2$ are

$$\langle v_2 \rangle = \langle v_1^2 \rangle, \tag{A1.5}$$

$$\langle v_2^2 \rangle = \langle v_1^4 \rangle = 3\sigma^4, \tag{A1.6}$$

$$\sigma_2^2 = \langle v_2^2 \rangle - \langle v_2 \rangle^2 = 2\langle v_1^2 \rangle^2. \tag{A1.7}$$


The averager averages $N = 2\Delta\nu\,\tau$ samples together, where $\tau$ is the integration time.
Hence,

$$\langle v_3 \rangle = \langle v_2 \rangle = k(T_S + T_A)\,\Delta\nu\,G, \tag{A1.8}$$

$$\sigma_3^2 = \frac{\sigma_2^2}{N} = \frac{2[k(T_S + T_A)\,\Delta\nu\,G]^2}{2\Delta\nu\,\tau}. \tag{A1.9}$$

$v_3$ is converted from a power scale to a temperature scale by inserting a thermal
noise signal of known temperature in order to remove or calibrate the $k\,\Delta\nu\,G$
factor. Formally, the calibrated version of $v_3$ is

$$v_T = \frac{v_3}{\partial v_3/\partial T_A}, \tag{A1.10}$$

where $(\partial v_3/\partial T_A)^{-1}$ is the conversion factor from power to temperature written as a
partial derivative. The mean and rms of the output in temperature units are therefore

$$\langle v_T \rangle = T_S + T_A, \tag{A1.11}$$

$$\sigma_T = \frac{T_S + T_A}{\sqrt{\Delta\nu\,\tau}}. \tag{A1.12}$$

It is important to note that the factor of two in the expression for $\sigma_2^2$ in Eq. (A1.7)
cancels the factor of two in the number of samples averaged. The signal-to-noise
ratio (SNR) is therefore

$$R_{\rm sn} = \frac{T_A}{T_S + T_A}\sqrt{\Delta\nu\,\tau}. \tag{A1.13}$$
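
The chain of sampling, square-law detection, and averaging analyzed above can be imitated with a small Monte Carlo experiment. The sketch below is only an illustration: it assumes independent Gaussian samples, works in units in which $k\,\Delta\nu\,G = 1$ so that the detector output is already in temperature units, and uses arbitrary values of $T_S$, $T_A$, and $N$. The function name simulate_radiometer is hypothetical.

```python
import math
import random


def simulate_radiometer(t_sys, t_ant, n_samples, n_trials, seed=1):
    """Mean and rms of the averaged square-law output over many trials.

    Units are chosen so that k * dnu * G = 1, i.e. <v1^2> = t_sys + t_ant.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(t_sys + t_ant)           # rms of the voltage v1
    outputs = []
    for _ in range(n_trials):
        total = 0.0
        for _ in range(n_samples):
            v1 = rng.gauss(0.0, sigma)
            total += v1 * v1                   # square-law detector, v2 = v1^2
        outputs.append(total / n_samples)      # averager, v3
    mean = sum(outputs) / n_trials
    var = sum((x - mean) ** 2 for x in outputs) / (n_trials - 1)
    return mean, math.sqrt(var)


T_SYS, T_ANT, N = 50.0, 1.0, 400               # arbitrary illustrative values
mean_out, rms_out = simulate_radiometer(T_SYS, T_ANT, N, n_trials=1500)
# N = 2 * dnu * tau, so dnu * tau = N / 2 in the expression for sigma_T.
predicted_rms = (T_SYS + T_ANT) / math.sqrt(N / 2)
print(f"mean output  : {mean_out:.2f} (expected {T_SYS + T_ANT:.2f})")
print(f"rms of output: {rms_out:.3f} (predicted {predicted_rms:.3f})")
```

The simulated rms should agree with $(T_S + T_A)/\sqrt{\Delta\nu\,\tau}$ to within the statistical scatter of the experiment, which is a few percent for the number of trials used here.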


Equation (A1.13) shows that $T_A$ contributes to the fluctuations. In the limit
$T_A \gg T_S$, the SNR approaches $\sqrt{\Delta\nu\,\tau}$, so longer integration still increases the
SNR even though a stronger signal does not. For $T_A \ll T_S$, the usual case,
Eq. (A1.13) becomes Eq. (1.8). Because of the fundamental limitation imposed by
the Nyquist sampling theorem, no receiver system can perform better than specified
by Eq. (A1.13). The performance of any other system can be written as

$$\sigma_T = C\,\frac{T_S + T_A}{\sqrt{\Delta\nu\,\tau}}, \tag{A1.14}$$

where $C$ is a factor equal to, or greater than, one. The square-law detector could be
replaced by another type of detector. For a linear detector, i.e., $v_2 = |v_1|$, a similar
analysis to the one for the square-law detector yields $C = \sqrt{\pi - 2} = 1.07$ when
$T_A \ll T_S$. In this calculation, it is necessary to linearize the output by calculation of
$\partial v_3/\partial T_A$ in Eq. (A1.10). For a fourth-order detector, $v_2 = v_1^4$, $C = \sqrt{4/3} = 1.15$.
More details can be found in Davenport and Root (1958).
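
The detector factors quoted above can be reproduced from the moments of a Gaussian variable. The sketch below generalizes the square-law analysis to a detector of the form $v_2 = |v_1|^p$; this generalization, the function names, and the use of absolute moments are our own choices for illustration and are not taken from Davenport and Root (1958). It assumes independent Nyquist samples and linearizes the output as in Eq. (A1.10), using the fact that the mean detector output scales as $T^{p/2}$.

```python
import math


def abs_moment(q: float) -> float:
    """E|x|^q for a zero-mean, unit-variance Gaussian variable."""
    return 2.0 ** (q / 2.0) * math.gamma((q + 1.0) / 2.0) / math.sqrt(math.pi)


def detector_factor(p: float) -> float:
    """Factor C for a detector v2 = |v1|**p (sketch, independent samples).

    The mean output scales as T**(p/2), so the linearized conversion to
    temperature multiplies the fractional rms by 2/p; averaging
    N = 2*dnu*tau samples contributes the remaining 1/sqrt(2).
    """
    mean = abs_moment(p)
    variance = abs_moment(2.0 * p) - mean ** 2
    return (2.0 / p) * math.sqrt(variance) / mean / math.sqrt(2.0)


for label, p in (("linear", 1), ("square-law", 2), ("fourth-order", 4)):
    print(f"{label:12s} detector: C = {detector_factor(p):.4f}")
```

The three cases give $C = \sqrt{\pi - 2}$, 1, and $\sqrt{4/3}$, respectively, in agreement with the values in the text.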

$G$ may not be constant but can vary randomly because of electronic instabilities. In
that case, a synchronous detector is added and the receiver input is switched between the
antenna and a reference signal. This system is known as a Dicke (1946) radiometer.
(Note that the phase-switching interferometer [see Fig. 1.8] uses the synchronous 
detection principle.) The noise performance of a Dicke radiometer is worse by 
a factor of two, but the effects of gain fluctuations are mitigated. An alternative 
receiver that reduces the effects of gain fluctuations is called the correlation receiver, 
in which the signal from the antenna is divided in half and passed through separate 
amplifiers before being multiplied, where the multiplier replaces the square-law 
detector. 

In older receivers, there was usually no digitization before the final averaging 
stage. The performance of a comparable analog system is identical to that described 
above. For analog analysis of radiometers, see Tiuri (1964) or Kraus (1986). A 
summary of the performance of various receiver types is given in Table A1.1.

Table A1.1 Sensitivity characteristics of various types of receivers

Receiver type                              $C^a$
Total power ($v_2 = v_1^2$)                1
Linear detector ($v_2 = |v_1|$)            1.07$^b$
Fourth-order detector ($v_2 = v_1^4$)      1.15$^b$
Dicke-switched receiver                    2
Correlation receiver                       $\sqrt{2}\,^b$

$^a$ $C$ is defined in Eq. (A1.14).
$^b$ For $T_A \ll T_S$.


There are two major differences between radio and optical systems. Radio
systems are characterized by Gaussian noise characteristics of both the signal and
the additive receiver noise, whereas optical detectors are limited by Poisson statistics
appropriate for counting photons, and the SNR, $R_{\rm sn}$, is $\sqrt{N_p}$, where $N_p$ is the
number of photons. In terms of quantum mechanics, the Gaussian noise corresponds
to photon bunching noise [see Radhakrishnan (1999)].


Chapter 2
Introductory Theory of Interferometry 
and Synthesis Imaging 


In this chapter, we provide a simplified analysis of interferometry and introduce 
several important concepts. We first consider an interferometer in one dimension 
and discuss the effect of finite bandwidth and show how the interferometer response 
can be interpreted as a convolution. We extend the analysis to two dimensions 
and discuss circumstances in which three-dimensional imaging can be undertaken. 
This chapter is intended to provide a broad introduction to the principles of 
synthesis imaging to facilitate the understanding of more detailed development in 
later chapters. A brief introduction to the theory of Fourier transforms is given in 
Appendix 2.1. 


2.1 Planar Analysis 


The instantaneous response of a radio interferometer to a point source can most sim- 
ply be analyzed by considering the signal paths in the plane containing the electrical 
centers of the two interferometer antennas and the source under observation. For an 
extended observation, it is necessary to take account of the rotation of the Earth and 
consider the geometric situation in three dimensions, as can be seen from Fig. 1.15. 
However, the two-dimensional geometry is a good approximation for short-duration 
observations, and the simplified approach facilitates visualization of the response 
pattern. 

Consider the geometric situation shown in Fig. 2.1, where the antenna spacing 
is east-west. The two antennas are separated by a distance D, the baseline, and 
observe the same cosmic source, which is in the far field of the interferometer; 
that is, it is sufficiently distant that the incident wavefront can be considered to be 
a plane over the distance D. The source will be assumed for the moment to have
infinitesimal angular dimensions. For this discussion, the receivers will be assumed
to have narrow bandpass filters that pass only signal components very close to $\nu$.

Fig. 2.1 Geometry of an elementary interferometer. D is the interferometer baseline.

As explained for the phase-switching interferometer in Chap. 1, the signal 
voltages are multiplied and then time-averaged, which has the effect of filtering 
out high frequencies. The wavefront from the source in direction $\theta$ reaches the right
antenna at a time

$$\tau_g = \frac{D}{c}\sin\theta \tag{2.1}$$

before it reaches the left one. $\tau_g$ is called the geometric delay, and c is the velocity of
light. Thus, in terms of the frequency $\nu$, the output of the multiplier is proportional to


$$\begin{aligned}
F &= 2\sin(2\pi\nu t)\,\sin 2\pi\nu(t-\tau_g) \\
  &= 2\sin^2(2\pi\nu t)\cos(2\pi\nu\tau_g) - 2\sin(2\pi\nu t)\cos(2\pi\nu t)\sin(2\pi\nu\tau_g)\,.
\end{aligned} \tag{2.2}$$


The center frequency of the receivers is generally in the range of tens of megahertz
to hundreds of gigahertz. As the Earth rotates, the most rapid rate of variation of
$\theta$ is equal to the Earth's rotational velocity, which is of the order of $10^{-4}$ rad s$^{-1}$.
Also, because D cannot be more than, say, $10^7$ m for terrestrial baselines, the rate of
variation of $\nu\tau_g$ is smaller than that of $\nu t$ by at least six orders of magnitude. For an
averaging period $T \gg 1/\nu$, the average value of $\sin^2(2\pi\nu t)$ is $\tfrac{1}{2}$ and the average
value of $\sin(2\pi\nu t)\cos(2\pi\nu t)$ is 0, leaving the fringe function

$$F = \cos 2\pi\nu\tau_g = \cos\left(\frac{2\pi D l}{\lambda}\right), \tag{2.3}$$

where $l = \sin\theta$; the definition of the variable $l$ is discussed further in Sect. 2.4.
For sidereal sources, the variation of $\theta$ with time as the Earth rotates generates
quasisinusoidal fringes at the correlator, which are the output of the interferometer.
Figure 2.2 shows an example of this function, which can be envisaged as the 
directional power reception pattern of the interferometer for the case in which the 
antennas either track the source or have isotropic responses and thus do not affect 
the shape of the pattern.

Fig. 2.2 Polar plot to illustrate the fringe function $F = \cos(2\pi D l/\lambda)$. The radial component is
equal to $|F|$, and $\theta$ is measured with respect to the vertical axis. Alternate lobes correspond to
positive and negative half-cycles of the quasi-sinusoidal fringe pattern, as indicated by the plus and
minus signs. To simplify the diagram, a very low value of 3 is used for $D/\lambda$. The increase in fringe
width due to foreshortening of the baseline as $|\theta|$ increases is clearly shown. The maxima in the
horizontal direction ($\theta = \pm 90°$) are a result of the arbitrary choice of an integer value for $D/\lambda$.
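As a numerical check of Eqs. (2.1) to (2.3), the following sketch averages the multiplier output over many cycles and compares the result with $\cos 2\pi\nu\tau_g$. The observing frequency, baseline, and source direction are illustrative assumptions and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def geometric_delay(baseline_m, theta_rad):
    """Eq. (2.1): tau_g = (D / c) sin(theta)."""
    return baseline_m / C * math.sin(theta_rad)

def averaged_multiplier_output(freq_hz, tau_g, n_cycles=200, samples_per_cycle=64):
    """Average 2 sin(2 pi nu t) sin(2 pi nu (t - tau_g)) over whole cycles."""
    n = n_cycles * samples_per_cycle
    total = 0.0
    for k in range(n):
        phase = 2.0 * math.pi * k / samples_per_cycle   # this is 2 pi nu t
        total += 2.0 * math.sin(phase) * math.sin(phase - 2.0 * math.pi * freq_hz * tau_g)
    return total / n

# Illustrative values (assumptions, not taken from the text).
freq = 1.4e9                 # observing frequency, Hz
baseline = 100.0             # baseline D, m
theta = math.radians(0.05)   # source direction

tau_g = geometric_delay(baseline, theta)
numeric = averaged_multiplier_output(freq, tau_g)
analytic = math.cos(2.0 * math.pi * freq * tau_g)       # Eq. (2.3)
print(f"averaged product = {numeric:.6f}, cos(2 pi nu tau_g) = {analytic:.6f}")
```

The two printed values should agree closely, because the term in $\sin(2\pi\nu t)\cos(2\pi\nu t)$ averages to zero over whole cycles.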


An alternate and equivalent way of envisaging the formation of the sinusoidal 
fringes is to note that because of the rotation of the Earth, the two antennas have 
different components of velocity in the direction of the source. The signals reaching 
the antennas thus suffer different Doppler shifts. When the signals are combined in 
the multiplying action of the receiving system, the sinusoidal output arises from the 
beats between the Doppler-shifted signals. 

A development of the simple analysis can be made if we consider two Fourier
components of the received signal at frequencies $\nu_1$ and $\nu_2$. These frequency
components are statistically independent so that the interferometer output is the
linear sum of the responses to each component. Hence, the output has components
$F_1$ and $F_2$, as in Eq. (2.3). For frequency $\nu_2$, the coefficient $2\pi D/\lambda = 2\pi D\nu_2/c$ will
be different from that for $\nu_1$, so $F_2$ will have a different period from $F_1$ at any given
angle $\theta$. This difference in period gives rise to interference between $F_1$ and $F_2$, so
that the fringe maxima have superimposed on them a modulation function that also
depends on $\theta$. Similar effects occur in the case of a continuous band of frequencies.
For example, if the signals at the correlator are of uniform power spectral density
over a band of width $\Delta\nu$ and center frequency $\nu_0$, the output becomes


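$$\begin{aligned}
F(l) &= \frac{1}{\Delta\nu}\int_{\nu_0-\Delta\nu/2}^{\nu_0+\Delta\nu/2}\cos\left(\frac{2\pi D l\nu}{c}\right)d\nu \\
     &= \cos\left(\frac{2\pi D l\nu_0}{c}\right)\frac{\sin(\pi D l\Delta\nu/c)}{\pi D l\Delta\nu/c}\,.
\end{aligned} \tag{2.4}$$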


Thus, the fringe pattern has an envelope in the form of a sinc function [$\mathrm{sinc}(x) =
(\sin\pi x)/\pi x$]. This is an example of the general result, to be discussed in the
following section, that in the case of uniform power spectral density at the antennas, 
the envelope of the fringe pattern is the Fourier transform of the instrumental 
frequency response. 
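The band average in Eq. (2.4) can be checked numerically. The sketch below evaluates the integral with a midpoint rule and compares it with the closed form, that is, the fringe term multiplied by the sinc envelope. The baseline, band, and direction values are assumptions chosen only for illustration.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def fringe_band_average(D, l, nu0, dnu, n=20000):
    """Midpoint-rule average of cos(2 pi D l nu / c) over the band (integral in Eq. 2.4)."""
    total = 0.0
    for k in range(n):
        nu = nu0 - dnu / 2.0 + (k + 0.5) * dnu / n
        total += math.cos(2.0 * math.pi * D * l * nu / C)
    return total / n

def fringe_closed_form(D, l, nu0, dnu):
    """Closed form of Eq. (2.4): fringe term times the sinc envelope."""
    x = math.pi * D * l * dnu / C
    envelope = 1.0 if x == 0.0 else math.sin(x) / x
    return math.cos(2.0 * math.pi * D * l * nu0 / C) * envelope

# Illustrative values (assumptions): 1 km baseline, 50 MHz band at 1.4 GHz.
D, nu0, dnu, l = 1000.0, 1.4e9, 50e6, 0.003
print(fringe_band_average(D, l, nu0, dnu), fringe_closed_form(D, l, nu0, dnu))
```

With these values the envelope argument $\pi D l \Delta\nu/c$ is close to $\pi/2$, so the fringe amplitude is reduced to roughly 0.64 of its value at $l = 0$.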
2.2 Effect of Bandwidth


Figure 2.3 shows an interferometer of the same general type as in Fig. 2.1 but with
the amplifiers $H_1$ and $H_2$, the multiplier, and an integrator (with respect to time)
shown explicitly. An instrumental time delay $\tau_i$ is inserted into one arm. Assume
that for a point source, each antenna delivers the same signal voltage $V(t)$ to the
correlator, and that one voltage lags the other by a time delay $\tau = \tau_g - \tau_i$, as
determined by the baseline D and the source direction $\theta$. The integrator within the
correlator has a time constant $2T$; that is, it sums the output from the multiplier
for 2T seconds and then resets to zero after the sum is recorded. The output of the 
correlator may be a voltage, a current, or a coded set of logic levels, but in any case, 
it represents a physical quantity with the dimensions of voltage squared. 


Fig. 2.3 Elementary interferometer showing bandpass amplifiers $H_1$ and $H_2$, the geometric time
delay $\tau_g$, the instrumental time delay $\tau_i$, and the correlator consisting of a multiplier and an
integrator.

The output from the correlator resulting from a point source¹ is

$$r = \frac{1}{2T}\int_{-T}^{T} V(t)\,V(t-\tau)\,dt\,. \tag{2.5}$$

We have ignored system noise and assumed that the two amplifiers have identical
bandpass characteristics, including finite bandwidths $\Delta\nu$ outside which no frequencies
are admitted. The integration time $2T$ is typically milliseconds to seconds, that
is, very much larger than $\Delta\nu^{-1}$. Thus, Eq. (2.5) can be written as

$$r(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} V(t)\,V(t-\tau)\,dt\,, \tag{2.6}$$

which is an (unnormalized) autocorrelation function. The condition $T \to \infty$ is
satisfied if a large number of variations of the signal amplitude, which have a
duration $\sim\Delta\nu^{-1}$, occur in time $2T$. The integration time used in practice must
clearly be finite and much less than the fringe period.

As described in Chap. 1, the signal from a natural cosmic source can be 
considered as a continuous random process that results in a broad spectrum, of 
which the phases are a random function of frequency. It will be assumed for our 
immediate purpose that the time-averaged amplitude of the cosmic signal in any 
finite band is constant with frequency over the passband of the receiver. 

The squared amplitude of a frequency spectrum is known as the power density 
spectrum, or power spectrum. The power spectrum of a signal is the Fourier 
transform of the autocorrelation function of that signal. This statement is known 
as the Wiener-Khinchin relation (see Appendix A2.1.5) and is discussed further in 
Sect. 3.2. It applies to signals that are either deterministic or statistical in nature and 
can be written 


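$$|H(\nu)|^2 = \int_{-\infty}^{\infty} r(\tau)\,e^{-j2\pi\nu\tau}\,d\tau\,, \tag{2.7}$$

and

$$r(\tau) = \int_{-\infty}^{\infty} |H(\nu)|^2\,e^{j2\pi\nu\tau}\,d\nu\,, \tag{2.8}$$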


where $H(\nu)$ is the amplitude (voltage) response, and hence $|H(\nu)|^2$ is the power
spectrum of the signal input to the correlator. In this case, because the cosmic
signal is assumed to have a spectrum of constant amplitude, the spectrum $H(\nu)$
is determined solely by the passband characteristics (frequency response) of the
receiving system from the outputs of the antennas to the output of the integrator.
Thus, the output of the interferometer as a function of the time delay $\tau$ is the
Fourier transform of the power spectrum of the cosmic signal as bandlimited by the
receiving system.

¹For simplicity, we consider only the signals from a point source, which are identical except for
a time delay. In practical systems, the input waveforms at the correlator may contain the partially
correlated signals from a partially resolved source as well as instrumental noise.

Assume, as a simple example, a Gaussian passband centered at $\nu_0$:

$$|H(\nu)|^2 = \frac{1}{2\sigma\sqrt{2\pi}}\left\{\exp\left[-\frac{(\nu-\nu_0)^2}{2\sigma^2}\right] + \exp\left[-\frac{(\nu+\nu_0)^2}{2\sigma^2}\right]\right\}, \tag{2.9}$$

where $\sigma$ is the bandwidth factor (the full bandwidth at half-maximum level is
$\sqrt{8\ln 2}\,\sigma$). Note that to perform the Fourier transforms in Eqs. (2.7) and (2.8),
we include a negative frequency response centered on $-\nu_0$. The spectrum is then
symmetrical with respect to zero frequency, which is consistent with the fact that
the autocorrelation function (which is the Fourier transform of the power spectrum)
is real. The negative frequencies have no physical meaning but arise mathematically
from the use of the exponential function. The interferometer response is

$$r(\tau) = e^{-2\pi^2\tau^2\sigma^2}\cos(2\pi\nu_0\tau)\,, \tag{2.10}$$


which is illustrated in Fig. 2.4a. Note that $r(\tau)$ is a cosinusoidal function multiplied
by an envelope function, in this case a Gaussian, whose shape and width depend on 
the amplifier passband. This envelope function is referred to as the delay pattern or 
bandwidth pattern. 
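The step from Eq. (2.9) to Eq. (2.10) can be verified numerically. The sketch below evaluates the Fourier integral of Eq. (2.8) for the Gaussian passband with a midpoint rule and compares it with the closed form. The center frequency, bandwidth factor, and delays are assumed values for illustration.

```python
import math

def gaussian_passband_response(tau, nu0, sigma, n=4000, half_width_sigmas=8.0):
    """Numerically evaluate Eq. (2.8) for the passband of Eq. (2.9)."""
    # The lobes at +nu0 and -nu0 contribute equally to the real part of the
    # integral, so the positive lobe is integrated once with full weight.
    lo = nu0 - half_width_sigmas * sigma
    step = 2.0 * half_width_sigmas * sigma / n
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(n):
        nu = lo + (k + 0.5) * step
        weight = norm * math.exp(-((nu - nu0) ** 2) / (2.0 * sigma ** 2))
        total += weight * math.cos(2.0 * math.pi * nu * tau)
    return total * step

def closed_form_response(tau, nu0, sigma):
    """Eq. (2.10): Gaussian delay pattern times the cosine fringe term."""
    return math.exp(-2.0 * math.pi ** 2 * tau ** 2 * sigma ** 2) * math.cos(2.0 * math.pi * nu0 * tau)

# Illustrative values (assumptions): 100 MHz center frequency, sigma = 5 MHz.
NU0, SIGMA = 100e6, 5e6
for tau in (0.0, 5e-9, 20e-9, 47e-9):
    print(tau, gaussian_passband_response(tau, NU0, SIGMA), closed_form_response(tau, NU0, SIGMA))
```

The envelope falls as the delay grows while the cosine term keeps the period $1/\nu_0$, which is the behavior plotted in Fig. 2.4a.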

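By setting the instrumental delay $\tau_i$ to zero and substituting for the geometric
delay $\tau_g = (D/c)\sin\theta$ in Eq. (2.10), we obtain the response

$$r(\tau_g) = \exp\left[-2\left(\frac{\pi D\sigma}{c}\sin\theta\right)^2\right]\cos\left(\frac{2\pi\nu_0 D}{c}\sin\theta\right). \tag{2.11}$$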


The period of the fringes (the cosine term) varies inversely as the quantity
$\nu_0 D/c = D/\lambda$ and does not depend on the bandwidth parameter $\sigma$. The width
of the bandwidth pattern (the exponential term), however, is a function of both
$\sigma$ and D; wide bandwidths and long baselines result in narrow fringe envelopes.
This result is quite general. For example, a rectangular amplifier passband of
width $\Delta\nu$, as considered in Eq. (2.4), results in an envelope pattern of the form
$[\sin(\pi\Delta\nu\tau)]/(\pi\Delta\nu\tau)$, as shown in Fig. 2.4b.
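To make the scaling of the bandwidth pattern in Eq. (2.11) concrete, the sketch below solves for the direction at which the exponential envelope falls to one half. Setting $\exp[-2(\pi D\sigma\sin\theta/c)^2] = 1/2$ gives $\sin\theta = (c/\pi D\sigma)\sqrt{(\ln 2)/2}$. The baseline and bandwidth values are assumptions for illustration.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def envelope(D, sigma, theta):
    """Bandwidth-pattern (exponential) factor of Eq. (2.11)."""
    return math.exp(-2.0 * (math.pi * D * sigma / C * math.sin(theta)) ** 2)

def half_envelope_sin_theta(D, sigma):
    """sin(theta) at which the envelope of Eq. (2.11) falls to 1/2."""
    return C / (math.pi * D * sigma) * math.sqrt(math.log(2.0) / 2.0)

# Illustrative values (assumptions): sigma = 10 MHz, baselines of 1 km and 2 km.
for D in (1000.0, 2000.0):
    s = half_envelope_sin_theta(D, 10e6)
    print(f"D = {D:6.0f} m: envelope is 1/2 at theta = {math.degrees(math.asin(s)):.4f} deg")
```

Doubling either the baseline or the bandwidth factor halves the angle, in line with the statement that wide bandwidths and long baselines give narrow fringe envelopes.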

In imaging applications, it is usually desirable to observe the fringes in the
vicinity of the maximum of the pattern, where the fringe amplitude is greatest. This
condition can be achieved by changing the instrumental delay $\tau_i$ continuously or
periodically so as to keep $\tau = \tau_g - \tau_i$ suitably small. If $\tau_i$ is adjusted in steps of
the reciprocal of the center frequency² $\nu_0$, the response remains cosinusoidal with
$\tau_g$. Note that for wide bandwidths, as $\Delta\nu$ approaches $\nu_0$, the width of the envelope
function becomes so narrow that only the central fringe remains. This occurs mainly
in optics, where a central fringe of this type is often called the “white light” fringe.


²This adjustment method is useful to consider here, but more commonly used methods are
described in Sects. 7.3.5 and 7.3.6.

Fig. 2.4 Point-source response of an interferometer with (a) Gaussian and (b) rectangular
passbands. The abscissa is the geometric delay $\tau_g$. The bandwidth pattern determines the
envelope of the fringe term.

2.3 One-Dimensional Source Synthesis


In the analysis of an interferometer in which the antennas and the instrumental delay
track the position of the source, as is the norm for frequencies above ~1 GHz, it is
convenient to specify angles of the antenna beam and other variables with respect to
a reference position on the sky, usually the center or nominal position of the source
under observation. This is commonly referred to as the phase reference position.
Since the range of angles required to specify the source intensity distribution relative
to this point is generally no more than a few degrees, small-angle approximations
can be used to advantage. The instrumental delay is constantly adjusted to equal the
geometric delay for radiation from the phase reference position. If we designate this
reference position as the direction $\theta_0$, then $\tau_i = (D/c)\sin\theta_0$. For radiation from a
direction $(\theta_0 - \Delta\theta)$, where $\Delta\theta$ is a small angle, the fringe response term is

$$\begin{aligned}
\cos(2\pi\nu_0\tau) &= \cos\left\{2\pi\nu_0\left[\frac{D}{c}\sin(\theta_0-\Delta\theta) - \tau_i\right]\right\} \\
&\approx \cos[2\pi\nu_0 (D/c)\sin\Delta\theta\cos\theta_0]
\end{aligned} \tag{2.12}$$


for $\cos\Delta\theta \approx 1$. When observing a source at any position in the sky, the angular
resolution of the fringes is determined by the length of the baseline projected onto
a plane normal to the direction of the source. In Fig. 2.1, for example, this is the
distance designated $D\cos\theta$. We therefore introduce a quantity $u$ that is equal to the
component of the antenna spacing normal to the direction of the reference position
$\theta_0$. $u$ is measured in wavelengths, $\lambda$, at the center frequency $\nu_0$, that is,

$$u = \frac{D\cos\theta_0}{\lambda} = \frac{\nu_0 D\cos\theta_0}{c}\,. \tag{2.13}$$

Since $\Delta\theta$ in Eq. (2.12) is small, we can assume that the bandwidth pattern is near
maximum (unity) in the direction $\theta_0 - \Delta\theta$. Then, from Eqs. (2.12) and (2.13), the
response to radiation from that direction is proportional to

$$F(l) = \cos(2\pi\nu_0\tau) = \cos(2\pi u l)\,, \tag{2.14}$$


where $l = \sin\Delta\theta$. This is the response to a point source at $\theta = \theta_0 - \Delta\theta$ of an
interferometer whose net delay $\tau_g - \tau_i$ is zero at $\theta = \theta_0$. As we shall show, the
quantity $u$ is interpreted as spatial frequency. It can be measured in cycles per radian,
since the spatial variable $l$, being small, can be expressed in radians.
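The accuracy of the small-angle step between Eqs. (2.12) and (2.14) can be examined with a short calculation. The sketch below computes $u$ from Eq. (2.13) and compares the exact fringe term, with the instrumental delay set to the geometric delay at $\theta_0$, against $\cos(2\pi u l)$. The baseline, frequency, and angles are assumptions for illustration.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def spatial_frequency_u(D, nu0, theta0):
    """Eq. (2.13): projected baseline in wavelengths."""
    return nu0 * D * math.cos(theta0) / C

def exact_fringe(D, nu0, theta0, dtheta):
    """First line of Eq. (2.12) with tau_i = (D / c) sin(theta0)."""
    tau_g = D / C * math.sin(theta0 - dtheta)
    tau_i = D / C * math.sin(theta0)
    return math.cos(2.0 * math.pi * nu0 * (tau_g - tau_i))

def approx_fringe(D, nu0, theta0, dtheta):
    """Eq. (2.14): cos(2 pi u l) with l = sin(dtheta)."""
    u = spatial_frequency_u(D, nu0, theta0)
    return math.cos(2.0 * math.pi * u * math.sin(dtheta))

# Illustrative values (assumptions): 1 km baseline at 1.4 GHz, reference
# direction 30 degrees, source offset of 1 arcminute.
D, NU0, THETA0, DTHETA = 1000.0, 1.4e9, math.radians(30.0), math.radians(1.0 / 60.0)
print("u =", spatial_frequency_u(D, NU0, THETA0), "wavelengths")
print(exact_fringe(D, NU0, THETA0, DTHETA), approx_fringe(D, NU0, THETA0, DTHETA))
```

For an offset this small the two values differ only in the fourth decimal place; the discrepancy grows as $\cos\Delta\theta$ departs from unity.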


2.3.1 Interferometer Response as a Convolution 


The response of a single antenna or an interferometer to a source can be expressed
in terms of a convolution. Consider first the response of a single antenna and a
receiver that measures the power received. Figure 2.5 shows the power reception
pattern of the antenna $A(\theta)$, which is a polar plot of the effective area of the antenna
as a function of angle from the center of the antenna beam. Also shown is the one-dimensional
intensity profile of a source $I_1(\theta')$, as defined in Eq. (1.9), in which
$\theta'$ is measured with respect to the center, or nominal position, of the source. The
component of the output power in bandwidth $\Delta\nu$ contributed by each element $d\theta'$
of the source is $\tfrac{1}{2}\Delta\nu A(\theta' - \theta)I_1(\theta')\,d\theta'$, where the factor $\tfrac{1}{2}$ takes account of the
ability of the antenna to respond to only one component of randomly polarized
radiation. The total output power from the antenna, omitting the constant factor
$\Delta\nu$, is proportional to

$$\int_{\rm source} A(\theta' - \theta)\,I_1(\theta')\,d\theta'\,. \tag{2.15}$$

This integral is equal to the cross-correlation of the antenna reception pattern and
the intensity distribution of the source. It is convenient to define $\mathcal{A}(\theta) = A(-\theta)$,
where $\mathcal{A}$ is the mirror image of $A$ with respect to $\theta$. Then expression (2.15) becomes

$$\int_{\rm source} \mathcal{A}(\theta - \theta')\,I_1(\theta')\,d\theta'\,. \tag{2.16}$$

The integral in expression (2.16) is an example of the convolution integral;
see Appendix 2.1, Eq. (A2.33). We can say that the output power of the antenna is
given by the convolution of the source with the mirror image of the power reception
pattern of the antenna. The mirror-image³ reception pattern can be described as the
response of the antenna to a point source.


³In many cases, the beam is symmetrical, and the mirror image is identical to the beam.

Fig. 2.5 The power pattern $A(\theta)$ of an antenna pointed in the direction OC, and the
intensity profile of a source $I_1(\theta')$, used to illustrate the convolution relationship. The
angle $\theta$ is measured with respect to the beam center OC. The profile of the source
is a function of $\theta'$, measured with respect to the direction of the nominal position of the
source OB.

In the case of an interferometer, we can express the response as a convolution
by replacing the antenna power pattern in Eq. (2.16) by the overall power pattern of
the interferometer. From the results presented earlier, we find that the response of
an interferometer is determined by three functions:


• The reception pattern of the antennas, which we represent as $A(l)$.

• The fringe pattern, $F(l)$, as in the example of Fig. 2.2 and given by Eq. (2.14).
Note that the fringe term in the interferometer output, being the product of two
voltages, is proportional to power.

• The bandwidth pattern, for example, as given by the sinc-function factor in
Eq. (2.4). In the general case, we can represent this by $F_B(l)$.


Note that the antenna beam is often symmetrical, in which case, if the interferometer 
fringes are aligned with the beam center, we can disregard the distinction between 
the interferometer power pattern and its mirror image in using the convolution 
relationship. 

Next, consider an interferometer with tracking antennas and an instrumental 
delay that is adjusted so the bandwidth pattern also tracks the source across the 
sky. In effect, the intensity distribution is modified by the antenna and bandwidth 
patterns. We can therefore envisage the output of the interferometer as the convolu- 
tion of (the mirror image of) the fringe pattern with the modified intensity. In terms 
of the convolution integral, the response can be written as 


$$R(l) = \int_{\rm source} \cos[2\pi u(l - l')]\,A(l')\,F_B(l')\,I_1(l')\,dl'\,, \tag{2.17}$$

or, more concisely,

$$R(l) = \cos(2\pi u l) * [A(l)\,F_B(l)\,I_1(l)]\,, \tag{2.18}$$

where the in-line asterisk symbol ($*$) denotes convolution. The intensity distribution
measured with the interferometer is modified by $A(l)$ and $F_B(l)$, but since these
are measurable instrumental characteristics, $I_1(l)$ can generally be recovered from
the product $A(l)F_B(l)I_1(l)$. In many cases, the angular size of the source is small
compared with the antenna beams and the bandwidth pattern, so these two functions
introduce only a constant in the expression for the response. To simplify the
discussion, we shall consider this case, and omitting constant factors, we can write
the essential response of the interferometer as

$$R(l) = \cos(2\pi u l) * I_1(l)\,. \tag{2.19}$$
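The convolution in Eq. (2.19) can be evaluated directly for a simple source. The sketch below assumes a uniform (boxcar) intensity profile of unit total flux and width $w$, which is a hypothetical model not used in the text, and compares a midpoint-rule evaluation of the convolution integral with the expected result $[\sin(\pi u w)/(\pi u w)]\cos(2\pi u l)$.

```python
import math

def response_to_uniform_source(u, l, width, n=2000):
    """Midpoint-rule evaluation of Eq. (2.19) for a uniform source of unit total flux."""
    step = width / n
    total = 0.0
    for k in range(n):
        lp = -width / 2.0 + (k + 0.5) * step          # position l' within the source
        total += math.cos(2.0 * math.pi * u * (l - lp)) * (1.0 / width) * step
    return total

def expected_response(u, l, width):
    """Analytic result: the fringe amplitude is reduced by sin(pi u w) / (pi u w)."""
    x = math.pi * u * width
    amplitude = 1.0 if x == 0.0 else math.sin(x) / x
    return amplitude * math.cos(2.0 * math.pi * u * l)

# Illustrative values (assumptions): u = 500 wavelengths, source width 1 mrad.
U, WIDTH = 500.0, 1.0e-3
for l in (0.0, 3.0e-4, 1.0e-3):
    print(l, response_to_uniform_source(U, l, WIDTH), expected_response(U, l, WIDTH))
```

The output is still a cosine of spatial frequency $u$, but its amplitude drops as the source width becomes comparable to the fringe spacing $1/u$.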


In the case of the early interferometer shown in Fig. 1.6, in which the antennas are
fixed in the meridian and do not track the source, the delays in the signal paths
between the antennas and the point at which the signals are multiplied are equal,
and there is no variable instrumental delay. Thus, the three functions that determine
the interferometer power pattern are all fixed with respect to the interferometer
baseline. The interferometer power pattern is of the form $A(l)\cos(2\pi u l)F_B(l)$, and
the response of the interferometer to the source is $[A(l)\cos(2\pi u l)F_B(l)] * I_1(l)$.

Most interferometers for operation at meter wavelengths, that is, at frequencies 
below about 300 MHz, use antennas that are arrays of fixed dipoles. At such long 
wavelengths, it is possible to obtain large collecting areas and still have wide enough 
beams that some minutes of observing time are obtained as a source passes through 
in sidereal motion. Often the bandwidth of such low-frequency instruments is small, 
so that the bandwidth pattern, $F_B(l)$, is wide and this factor can be omitted. Also, the
antenna beams are usually wider than the source and sufficiently wide that several 
cycles of the fringe pattern can be measured as the source transits the beam. So 
in the nontracking case, the essential form of the response is also represented by 
Eq. (2.19). However, fixed antennas with nontracking beams are mainly a feature of 
the early years of radio astronomy, and in more recent meter-wavelength arrays, the 
phases of individual dipoles, or small clusters of dipoles, can be adjusted to provide 
steerable beams. 


2.3.2 Convolution Theorem and Spatial Frequency 


We now examine the interferometer response, as given in Eq. (2.19), using the 
convolution theorem of Fourier transforms (see the derivation in Appendix A2.1.2), 
which can be expressed as: 


$$f * g \longleftrightarrow FG\,, \tag{2.20}$$

where $f \longleftrightarrow F$, $g \longleftrightarrow G$, and $\longleftrightarrow$ indicates Fourier transformation. Consider the
Fourier transforms with respect to $l$ and $u$ of the three functions in Eq. (2.19). For
the interferometer response, we have $r(u) \longleftrightarrow R(l)$. For a particular value $u = u_0$,
the Fourier transform of the fringe term is given by [see Fourier transform example
in Eq. (A2.15)]

$$\cos(2\pi u_0 l) \longleftrightarrow \tfrac{1}{2}\left[\delta(u + u_0) + \delta(u - u_0)\right], \tag{2.21}$$

where $\delta$ is the delta function defined in Appendix 2.1. The Fourier transform of $I_1(l)$
is the visibility function $V(u)$. Thus, from Eqs. (2.19), (2.20), and (2.21), we obtain

$$\begin{aligned}
r(u) &= \tfrac{1}{2}\left[\delta(u + u_0) + \delta(u - u_0)\right]V(u) \\
     &= \tfrac{1}{2}\left[V(-u_0)\,\delta(u + u_0) + V(u_0)\,\delta(u - u_0)\right].
\end{aligned} \tag{2.22}$$

This result shows that the instantaneous output of the interferometer as a function
of spatial frequency consists of two delta functions situated at plus and minus $u_0$ on
the $u$ axis. Now, $V(u)$, the Fourier transform of $I_1(l)$, represents the amplitude and
phase of the sinusoidal component of the intensity profile with spatial frequency $u$
cycles per radian. The interferometer acts as a filter that responds only to spatial
frequencies $\pm u_0$. The negative spatial frequency $-u_0$ has no physical meaning.
It arises from the use, for mathematical convenience, of the exponential Fourier 
transform rather than the sine and cosine transforms, which correspond more 
directly to the physical situation. As a result, the spatial frequency spectra are 
symmetrical about the origin in the Hermitian sense, that is, with even real parts and 
odd imaginary parts, which is appropriate since the intensity is a real, not complex, 
quantity. 
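The filtering and Hermitian properties described above can be illustrated with a point-component source model. The sketch below assumes the transform convention $V(u) = \sum_k S_k e^{-j2\pi u l_k}$ (the sign of the exponent is an assumption here), and checks that the fringe response of Eq. (2.19) at spacing $u_0$ is fixed by $V(u_0)$ alone and that $V(-u_0)$ is the complex conjugate of $V(u_0)$. The source components and spacing are hypothetical.

```python
import cmath
import math

def visibility(u, source):
    """V(u) = sum_k S_k exp(-j 2 pi u l_k) for point components (S_k, l_k)."""
    return sum(s * cmath.exp(-2j * math.pi * u * lk) for s, lk in source)

def fringe_response(u0, l, source):
    """Eq. (2.19) for the same components: sum_k S_k cos[2 pi u0 (l - l_k)]."""
    return sum(s * math.cos(2.0 * math.pi * u0 * (l - lk)) for s, lk in source)

# Hypothetical source: (flux density, position in radians) for each component.
source = [(2.0, -0.003), (1.0, 0.003)]
u0 = 120.0                      # hypothetical spacing in wavelengths
l = 0.0012                      # direction at which the response is evaluated

v_plus = visibility(u0, source)
v_minus = visibility(-u0, source)

# The fringe is a cosine whose amplitude and phase are those of V(u0).
from_visibility = (v_plus * cmath.exp(2j * math.pi * u0 * l)).real
print(fringe_response(u0, l, source), from_visibility)
print("Hermitian:", v_minus, v_plus.conjugate())
```

Because the intensity is real, the value at $-u_0$ carries no independent information, which is why a single measurement at $u_0$ suffices.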

Fringe visibility, as originally defined by Michelson [$V_M$, see Eq. (1.9)], is a real
quantity and is normalized to unity for an unresolved source. Complex visibility 
(Bracewell 1958) was defined to take account of the phase of the visibility, measured 
as the fringe phase, to allow imaging of asymmetric and complicated sources. The 
normalization is convenient when comparing measurements with simple models, 
as shown in Fig. 1.5. However, in images, it is desirable to display the magnitude 
of the intensity or brightness temperature, so the general practice is to retain the 
measured value of visibility, without normalization, since this incorporates the 
required information. Thus, visibility V as used here is an unnormalized complex 
quantity with units of flux density (W m$^{-2}$ Hz$^{-1}$). The quantity $u$, which was
introduced as the projected baseline in wavelengths, is seen also to represent the 
spatial frequency of the Fourier components of the intensity. The concepts of spatial 
frequency and spatial frequency spectra are fundamental to the Fourier synthesis of 
astronomical images, and this general subject is discussed in a seminal paper by 
Bracewell and Roberts (1954). 


2.3.3 Example of One-Dimensional Synthesis 


To illustrate the observing process outlined in this chapter, we present a rudimentary 
simulation of measurements of the complex visibility of a source using arbitrary 
parameters. The source consists of two components separated by 0.34° of angle, 
the flux densities of which are in the ratio 2 : 1. The measurements are made with 
pairs of antennas placed along a line parallel to the direction of separation of the 
two components. Measurements are made for antenna spacings that are integral 
multiples of a unit spacing of 30 wavelengths. All spacings from 1 to 23 times the 
unit spacing are measured. These results could be obtained using two antennas and 
a single correlator, observing the source as it transits the meridian on 23 different 
days and moving the antennas to provide a new spacing each day. Alternately, the 23 
measurements could be made simultaneously using 23 correlators and a number of 
antennas that could be as small as 8 (if they were set out with minimum redundancy 
in the spacings, as discussed in Sect. 5.5). The angular sizes of the two components 
of the source are too small to be resolved by the interferometer, so they can be 
regarded as point radiators. The two components radiate noise, and their two outputs 
are uncorrelated. The source is at a sufficient distance that incoming wavefronts can 
be considered to be plane over the measurement baselines.

Figure 2.6a and b show, respectively, the amplitude⁴ and phase of the visibility
function as it would be measured. Since the data are derived from a model, there 
are no measurement errors, so the points indicate samples of the Fourier transform 
of the source intensity distribution, which can be represented by two delta functions 
with strengths in the ratio 1 : 2. Taking the inverse transform of the visibility yields 
the synthesized image of the source in Fig. 2.6c. The two components of the source 
are clearly represented. The extraneous oscillations arise from the finite extent of 
the visibility measurements, which are uniformly weighted out to a cutoff at 23 
times the unit spacing. This effect is further illustrated in Fig. 2.6d, which shows
the response of the measurement procedure to a single point source; equivalently, 
it is the synthesized beam. The profile of this response is the sinc function that 
is the Fourier transform of the rectangular window function, which represents the 
cutoff of the measurements at the longest spacing. In the image domain, the double- 
source profile can be viewed as the convolution of the source with the point-source 
response. The point-source nature of the model components maximizes the sidelobe 
oscillations, which would be partially smoothed out if the width of the components 
were comparable to that of the sidelobes. 
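A rough reproduction of this simulation is sketched below. The separation, flux ratio, unit spacing, and number of spacings come from the text; the absolute flux densities, the placement of the components symmetrically about the origin, the sign convention of the transform, and the image grid are assumptions. The image is formed by summing the measured visibilities with their Hermitian conjugates, with uniform weighting and no zero-spacing term.

```python
import cmath
import math

SEPARATION = math.radians(0.34)            # component separation (from the text)
SOURCE = [(2.0, -SEPARATION / 2.0),        # (flux, position l); ratio 2:1 from the text,
          (1.0, +SEPARATION / 2.0)]        # absolute values and positions are assumed
UNIT_SPACING = 30.0                        # wavelengths (from the text)
SPACINGS = [UNIT_SPACING * n for n in range(1, 24)]

def visibility(u, source):
    """Complex visibility of point components at spacing u (assumed sign convention)."""
    return sum(s * cmath.exp(-2j * math.pi * u * lk) for s, lk in source)

def synthesized_image(l, spacings, vis):
    """Inverse transform over measured spacings and their conjugates, unit-peak beam."""
    total = sum((v * cmath.exp(2j * math.pi * u * l)).real for u, v in zip(spacings, vis))
    return total / len(spacings)

GRID_DEG = [-0.5 + 0.002 * k for k in range(501)]        # assumed image grid, degrees
vis = [visibility(u, SOURCE) for u in SPACINGS]
image = [synthesized_image(math.radians(d), SPACINGS, vis) for d in GRID_DEG]

peak_index = max(range(len(image)), key=image.__getitem__)
right_half = [i for i, d in enumerate(GRID_DEG) if d > 0.0]
weak_index = max(right_half, key=image.__getitem__)
print("strong component near", GRID_DEG[peak_index], "deg, height", image[peak_index])
print("weak component near", GRID_DEG[weak_index], "deg, height", image[weak_index])
```

The two peaks appear close to the model positions with heights near 2 and 1; the small departures come from each component sitting on the sidelobes of the other.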

As is clear from the convolution relationship, information on the structure of 
the source is contained in the whole response pattern in Fig. 2.6c, that is, in the 
sidelobe oscillations as well as the main-beam peaks. A way to extract the maximum 
information on the source structure would be to fit scaled versions of the response 
in Fig. 2.6d to the two peaks in Fig. 2.6c and then subtract them from the profile. 
In an actual observation, this would leave the noise and any structure that might 
be present in addition to the point sources but would remove all or most of the 
sidelobes. The fitting of the point-source responses could be adjusted to minimize 
some measure of the residual fluctuations, and further components could be fitted 
to any remaining peaks and subtracted. This technique would clearly be a good 
way to estimate the strengths and positions of the two components and to look 
for evidence of any low-level structure that could be hidden by the sidelobes in 
Fig. 2.6c. The CLEAN algorithm, which is discussed in Chap. 11, uses this principle 
but also replaces the components that are removed by model beam responses that 
are free of sidelobes. Removal of the sidelobes allows any lower-level structure to 
be investigated, down to the level of the noise. Most synthesis images are processed 
by nonlinear algorithms of this type, and the range of intensity levels achieved in 
some two-dimensional images exceeds 10° to 1. 
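The fit-and-subtract idea described above can be sketched in a few lines. What follows is a simplified, Högbom-style iterative loop written for this one-dimensional example; it is not the implementation discussed in Chap. 11 and omits the final step of restoring components with a sidelobe-free beam. The grid step, loop gain, iteration count, and absolute fluxes are assumptions, and the two components are placed exactly on grid points to keep the example small.

```python
import math

N_SPACINGS = 23                  # spacings 1..23 times the unit spacing (from the text)
UNIT_SPACING = 30.0              # wavelengths (from the text)
STEP = math.radians(0.002)       # assumed image grid step
N_PIX = 501                      # grid spans -0.5 to +0.5 degrees

def beam(offset):
    """Point-source response for uniform weighting, normalized to a peak of 1."""
    return sum(math.cos(2.0 * math.pi * UNIT_SPACING * n * offset)
               for n in range(1, N_SPACINGS + 1)) / N_SPACINGS

# Beam tabulated for every pixel offset; index is (i - j) + (N_PIX - 1).
BEAM = [beam((d - (N_PIX - 1)) * STEP) for d in range(2 * N_PIX - 1)]

def dirty_image(components):
    """Image of point components [(flux, pixel), ...] as seen through the beam."""
    return [sum(f * BEAM[i - j + N_PIX - 1] for f, j in components) for i in range(N_PIX)]

def clean(image, gain=0.2, n_iter=60):
    """Repeatedly subtract a scaled beam at the largest residual peak."""
    residual = list(image)
    model = {}
    for _ in range(n_iter):
        j = max(range(N_PIX), key=lambda i: abs(residual[i]))
        flux = gain * residual[j]
        model[j] = model.get(j, 0.0) + flux
        for i in range(N_PIX):
            residual[i] -= flux * BEAM[i - j + N_PIX - 1]
    return model, residual

# Pixel 165 is at -0.17 deg and pixel 335 at +0.17 deg (0.34 deg apart).
true_components = [(2.0, 165), (1.0, 335)]
model, residual = clean(dirty_image(true_components))
for pixel in sorted(model):
    print(f"pixel {pixel}: cleaned flux {model[pixel]:.3f}")
print("largest residual:", max(abs(r) for r in residual))
```

In this noise-free case the accumulated components approach the model fluxes and the residual sidelobes shrink with each iteration; with real data the loop would normally stop once the residual reaches the noise level.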


⁴It is arguable that the modulus of the complex visibility should be referred to as magnitude rather
than amplitude since the dimensions of visibility include power rather than voltage. However,
the term visibility amplitude is widely used in radio astronomy, probably resulting from the early
practice of recording the fringe pattern as a quasi-sinusoidal waveform, and subsequently analyzing
the amplitude and phase of the oscillations.

2.4 Two-Dimensional Synthesis


Synthesis of an image of a source in two dimensions on the sky requires measure- 
ment of the two-dimensional spatial frequency spectrum in the (u, v) plane, where 
v is the north-south component as shown in Fig. 2.7a. Similarly, it is necessary to 
define a two-dimensional coordinate system (l, m) on the sky. The (l, m) origin is 
the reference position, or phase reference position, introduced in the last section. 
In considering functions in one dimension in the earlier part of this chapter, it was 
possible to define $l$ in Eq. (2.3) as the sine of an angle. In two-dimensional analysis,
$l$ and $m$ are defined as the cosines of the angles between the direction $(l, m)$ and the
$u$ and $v$ axes, respectively, as shown in Fig. 2.7c. If the angle between the direction
$(l, m)$ and the $w$ axis is small, $l$ and $m$ can be considered as the components of this
angle measured in radians in the east-west and north-south directions, respectively.

For a source near the celestial equator, measuring the visibility as a function of 
u and v requires observing with a two-dimensional array of interferometers, that is, 
an array in which the baselines between pairs of antennas contain components in the 
north-south as well as the east-west directions. Although we have considered only 
east-west baselines, the results derived in terms of angles measured with respect to
a plane that is normal to the baseline hold for any baseline direction. 

Fig. 2.7 (a) The $(u, v)$ plane in which the arrow point indicates the spatial frequency, $q$ cycles per
radian, of one Fourier component of an image of the intensity of a radio source. The components $u$
and $v$ of the spatial frequency are measured along axes in the east-west and north-south directions,
respectively. (b) The $(l, m)$ plane in which a single component of spatial frequency in the intensity
domain has the form of sinusoidal corrugations on the sky. The figure shows corrugations that
represent one such component. The diagonal lines indicate the ridges of maximum intensity. The
dots indicate the positions of these maxima along lines in three directions. In a direction normal to
the ridges, the frequency of the oscillations is $q$ cycles per radian, and in directions parallel to the $u$
and $v$ axes, it is $u$ and $v$ cycles per radian, respectively. (c) The $u$ and $v$ coordinates define a plane,
and the $w$ coordinate is perpendicular to it. The coordinates $(l, m)$ are used to specify a direction
on the sky in two dimensions. $l$ and $m$ are defined as the cosines of the angles made with the $u$ and
$v$ axes, respectively.

Fig. 2.8 Illustration of the projection-slice theorem, which explains the relationships between
one-dimensional projections and cross sections of intensity and visibility functions. One-dimensional
Fourier transforms are organized horizontally and projections vertically. The symbols $\mathcal{F}$ and $^2\mathcal{F}$
indicate one-dimensional and two-dimensional Fourier transforms, respectively. See the text for
further explanation. From Bracewell (1956). © CSIRO 1956. Published by CSIRO Publishing,
Melbourne, Victoria, Australia. Reproduced with permission.

A source at a high declination (near the celestial pole) can be imaged in two
dimensions with either one- or two-dimensional arrays, as shown in Fig. 1.15 and
further explained in Sect. 4.1. As the Earth rotates, the baseline projection on the
celestial sphere rotates and foreshortens. A plot of the variation of the length and 
direction of the projected baseline as the antennas track the source across the sky is 
an arc of an ellipse in the (u, v) plane. The parameters of the ellipse depend on the 
declination of the source, the length and orientation of the baseline, and the latitude 
of the center of the baseline. In the design of a synthesis array, the relative positions 
of the antennas are chosen to provide a distribution of measurements in u and v 
consistent with the angular resolution, field of view, declination range, and sidelobe 
level required, as discussed in Chap. 5. The two-dimensional intensity distribution 
is then obtained by taking a two-dimensional Fourier transform of the observed 
visibility, V(u, v). 


2.4.1 Projection-Slice Theorem 


Some important relationships between one-dimensional and two-dimensional functions
of intensity and visibility are summarized in Fig. 2.8, which illustrates the
projection-slice theorem of Fourier transforms (Bracewell 1956, 1995, 2000). At the
top left is the two-dimensional intensity distribution of a source I(l, m), and at the
bottom right is the corresponding visibility function V(u, v). These two functions
are related by a two-dimensional Fourier transform, as indicated on the arrows
shown between them. Note the general property of Fourier transforms that the width
in one domain is inversely related to the width in the other domain. At the lower
left is the projection of I(l, m) on the l axis, which is equal to the one-dimensional
intensity distribution $I_1(l)$. This projection is obtained by line integration along lines
parallel to the m axis, as defined in Eq. (1.10). $I_1$ is related by a one-dimensional
Fourier transform to the visibility measured along the u axis at the lower right, that
is, the profile of a slice V(u, 0) through the visibility function V(u, v), indicated
by the shaded area in the diagram. V(u, 0) could be measured, for example, by
observations of a source made at meridian transit with a series of interferometer
baselines in an east-west direction. This relationship was encountered in Chap. 1
in the description of the Michelson interferometer, and examples of such pairs of
functions are shown in Fig. 1.5. At the upper right is a projection of V(u, v) on
the u axis, $V_1(u) = \int V(u, v)\, dv$, and this is related by a one-dimensional Fourier
transform to a slice profile of the source intensity I(l, 0) along the l axis at the upper
left, indicated by the shaded area. The relationships between the projections and
slices are not confined to the u and l axes but apply to any sets of axes that are
parallel in the two domains. For example, integration of I(l, m) along lines parallel
to OP results in a curve, the Fourier transform of which is the profile of a slice
through V(u, v) along the line QR.
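
The projection-slice relationship has an exact discrete counterpart that is easy to check
numerically. The following is a minimal sketch, assuming a small sampled image and a direct
discrete Fourier transform (DFT) as a stand-in for the continuous transform; the grid size, the
two-component test image, and the helper names `dft`, `dft2`, and `blob` are hypothetical
choices made only for illustration.

```python
import cmath
import math


def dft(x):
    """Direct 1-D DFT: X[k] = sum_n x[n] exp(-j 2 pi k n / N)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


def dft2(image):
    """2-D DFT of a square array image[m][l]; returns vis[v][u]."""
    n = len(image)
    rows = [dft(row) for row in image]                               # l -> u for each m
    cols = [dft([rows[m][u] for m in range(n)]) for u in range(n)]   # m -> v for each u
    return [[cols[u][v] for u in range(n)] for v in range(n)]


N = 16  # grid size (arbitrary assumption)


def blob(l, m, l0, m0, width):
    """Circular Gaussian component centered at (l0, m0)."""
    return math.exp(-((l - l0) ** 2 + (m - m0) ** 2) / (2.0 * width ** 2))


# Toy intensity distribution I(l, m), stored as image[m][l].
image = [[blob(l, m, 5, 6, 1.5) + 0.5 * blob(l, m, 10, 9, 1.0) for l in range(N)]
         for m in range(N)]

vis = dft2(image)

# Projection of I(l, m) onto the l axis, I_1(l), and its 1-D transform.
projection = [sum(image[m][l] for m in range(N)) for l in range(N)]
ft_projection = dft(projection)

# Slice V(u, 0) through the 2-D visibility.
slice_v0 = vis[0]

max_diff = max(abs(a - b) for a, b in zip(ft_projection, slice_v0))
print(f"max |FT[I_1](u) - V(u, 0)| = {max_diff:.2e}")
```

The difference is at the level of floating-point rounding, because summing over m before
transforming along l is the same operation as evaluating the two-dimensional transform at v = 0.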

The relationships in Fig. 2.8 apply to Fourier transforms in general, and their
application to radio astronomy was recognized during the early development of the
subject. For example, in determining the two-dimensional intensity of a source from
a series of fan-beam scans at different angles, one can perform one-dimensional
transforms of the scans to obtain values of V along a series of lines through the
origin of the (u, v) plane, thus obtaining the two-dimensional visibility V(u, v).
Then, I(l, m) can be obtained by two-dimensional Fourier transformation. In the
early years of radio astronomy, before computers were widely available, such
computation was a very laborious task, and various alternative procedures for
image formation from fan-beam scans were devised (Bracewell 1956; Bracewell
and Riddle 1967).

As this introductory chapter has shown, much of the theory of interferometry 
is concerned with data in two forms or domains. Within the literature, there is 
some variation in the associated terminology. The observations provide data in 
the visibility domain, also variously referred to as the spatial frequency, (u, v), or 
correlation domain. The astronomical results are shown in the image domain, also 
variously referred to as the brightness, intensity, sky, or map domain. “Map” was 
appropriate in earlier years when the image was sometimes in the form of contours 
of intensity.


2.4.2 Three-Dimensional Imaging


Three-dimensional images can be made of objects that are optically thin and 
rotating. An image taken at a particular time is the projected image along the line 
of sight. A series of images taken at different projection angles can be combined to
obtain an estimate of the three-dimensional distribution of emitters in the source.
This can be done in a straightforward fashion by use of the three-dimensional 
generalization of the projection-slice theorem, described in Sect. 2.4.1, to build up a 
three-dimensional visibility function. Such a technique was developed and first used 
to image the radiation belts of Jupiter by Sault et al. (1997). A somewhat different 
tomographic technique was developed by de Pater et al. (1997). The techniques 
were compared by de Pater and Sault (1998). These techniques might be applicable 
to extended stellar atmospheres observed with VLBI arrays. 


Appendix 2.1 A Practical Fourier Transform Primer 


This appendix is intended to provide a brief introduction to the principles of Fourier
transform theory most relevant to radio interferometry. For more comprehensive
treatment, see Bracewell (1995, 2000), Champeney (1973), and Papoulis (1962).
The Fourier transform of a function f(x) can be written as

$$F(s) = \int_{-\infty}^{\infty} f(x)\, e^{-j2\pi sx}\, dx . \tag{A2.1}$$

The inverse transform is

$$f(x) = \int_{-\infty}^{\infty} F(s)\, e^{j2\pi sx}\, ds . \tag{A2.2}$$

The transform pair is written symbolically as

$$f(x) \longleftrightarrow F(s) . \tag{A2.3}$$

If x has units of meters, then s has units of cycles/meter; if x has units of time, then s
has units of cycles/second, i.e., hertz. The Fourier transform pair can also be written
in the form normally used in the time-frequency domains as

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt , \tag{A2.4}$$

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(\omega)\, e^{j\omega t}\, d\omega . \tag{A2.5}$$

In this case, the frequency is an angular frequency in radians/sec. We use the
formulation in Eqs. (A2.1) and (A2.2) for three reasons: It is widely used in image
analysis, it allows for easier tracking of 2π factors, and it provides a more natural
segue to the discussion of the discrete Fourier transform (see Appendix 8.4).

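We can check that f(x) can be recovered from F(s) by the substitution of
Eq. (A2.1) into Eq. (A2.2),

$$f(x) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} f(x')\, e^{-j2\pi sx'}\, dx' \right] e^{j2\pi sx}\, ds , \tag{A2.6}$$

where we switched the variable x to x′ to allow us to interchange the order of
integration, thereby obtaining

$$f(x) = \int_{-\infty}^{\infty} f(x') \left[ \int_{-\infty}^{\infty} e^{-j2\pi s(x'-x)}\, ds \right] dx' . \tag{A2.7}$$

The integral in brackets can be evaluated by a limit process, i.e.,

$$\int_{-\infty}^{\infty} e^{-j2\pi s(x'-x)}\, ds = \lim_{s_0 \to \infty} \int_{-s_0}^{s_0} e^{-j2\pi s(x'-x)}\, ds = \lim_{s_0 \to \infty} 2s_0 \left[ \frac{\sin 2\pi s_0 (x'-x)}{2\pi s_0 (x'-x)} \right] . \tag{A2.8}$$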


The function in the brackets is a sinc function (see Fig. A2.1) centered at x′ = x,
having a width between first nulls of $1/s_0$. Including the factor $2s_0$, its integral is
unity, which happens to equal the area of the triangle formed by the peak and the first
nulls. The limit of this function can be used as a definition of the Dirac delta function
(often called the impulse function in the engineering literature),


$$\delta(x'-x) = \lim_{s_0 \to \infty} 2s_0 \left[ \frac{\sin 2\pi s_0 (x'-x)}{2\pi s_0 (x'-x)} \right] , \tag{A2.9}$$

which is undefined at x′ = x and has the properties

$$\delta(x'-x) = 0 , \qquad x' \neq x \tag{A2.10a}$$

and

$$\int_{-\infty}^{\infty} \delta(x'-x)\, dx' = 1 . \tag{A2.10b}$$

Substitution of Eqs. (A2.9) and (A2.8) into Eq. (A2.7) gives

$$f(x) = \int_{-\infty}^{\infty} f(x')\, \delta(x'-x)\, dx' . \tag{A2.11}$$

Fig. A2.1 The sinc function in Eq. (A2.9), whose limiting form is a delta function, δ(x′ − x).


Since δ(x′ − x) is nonzero only at x′ = x, it is clear from Eq. (A2.10b) that we
can factor f(x) out of the integral in Eq. (A2.11), which gives the desired result,
f(x) = f(x), and proves that f(x) can be recovered from its transform, F(s).
Equation (A2.11) is called the sifting property of δ(x).
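
The limiting process behind Eqs. (A2.8) to (A2.11) can be watched numerically. The sketch
below, which assumes a unit-width Gaussian test function and a simple midpoint-rule
integration (both arbitrary choices), replaces the delta function by the finite-$s_0$ kernel of
Eq. (A2.8) and evaluates the integral at x = 0. For this particular test function the result
should equal $\mathrm{erf}(\sqrt{2}\,\pi s_0)$, because the integral is the area of the Gaussian
transform between $-s_0$ and $s_0$; the function names are hypothetical.

```python
import math


def kernel(dx, s0):
    """2 s0 sin(2 pi s0 dx) / (2 pi s0 dx), the finite-s0 kernel of Eq. (A2.8)."""
    arg = 2.0 * math.pi * s0 * dx
    if abs(arg) < 1e-12:
        return 2.0 * s0          # limiting value at dx = 0
    return 2.0 * s0 * math.sin(arg) / arg


def f(x):
    """Test function: Gaussian with a = 1, so f(0) = 1."""
    return math.exp(-0.5 * x * x)


def sifted(x, s0, half_range=10.0, n=4000):
    """Midpoint-rule estimate of the integral of f(x') kernel(x' - x) over x'."""
    h = 2.0 * half_range / n
    total = 0.0
    for i in range(n):
        xp = -half_range + (i + 0.5) * h
        total += f(xp) * kernel(xp - x, s0)
    return total * h


results = {s0: sifted(0.0, s0) for s0 in (0.05, 0.2, 1.0)}
for s0, value in results.items():
    expected = math.erf(math.sqrt(2.0) * math.pi * s0)
    print(f"s0 = {s0:4.2f}: integral = {value:.6f}, erf form = {expected:.6f}")
```

As $s_0$ increases, the kernel narrows and the integral approaches f(0) = 1, which is the
sifting behavior expressed by Eq. (A2.11).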


A2.1.1 Useful Fourier Transform Pairs 


We mention five Fourier transform pairs of particular interest to readers of this book.
The first pair is

$$f(x) = \begin{cases} 1 , & |x| \le x_0/2 , \\ 0 , & \text{otherwise} , \end{cases} \tag{A2.12a}$$

$$F(s) = x_0\, \frac{\sin \pi s x_0}{\pi s x_0} = x_0\, \mathrm{sinc}(s x_0) . \tag{A2.12b}$$

f(x) is called a boxcar or unit rectangular function and denoted as Π(x).
The second Fourier transform is of a Gaussian function

$$f(x) = e^{-x^2/2a^2} , \tag{A2.13a}$$

$$F(s) = \sqrt{2\pi}\, a\, e^{-2\pi^2 a^2 s^2} . \tag{A2.13b}$$

F(s) can be calculated by a procedure called “completing the square”:

$$F(s) = \int_{-\infty}^{\infty} e^{-x^2/2a^2}\, e^{-j2\pi sx}\, dx . \tag{A2.14}$$

The term in the exponent is
$-(x^2 + j4\pi a^2 sx)/2a^2 = -[(x + j2\pi a^2 s)^2 + 4\pi^2 a^4 s^2]/2a^2$.
The term involving $4\pi^2 a^4 s^2$ can be factored out of the integral, and the remaining
Gaussian integral equals $\sqrt{2\pi}\, a$, which leads to Eq. (A2.13b).
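
The first two transform pairs can be checked by evaluating Eq. (A2.1) directly. The following is
a minimal sketch that assumes a midpoint-rule quadrature and arbitrary values for $x_0$ and a;
the helper names `fourier_transform` and `sinc` and all numerical parameters are hypothetical.

```python
import cmath
import math


def fourier_transform(f, s, lo, hi, n=4000):
    """Midpoint-rule estimate of Eq. (A2.1), integrating f over [lo, hi]."""
    h = (hi - lo) / n
    total = 0j
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * s * x)
    return total * h


def sinc(u):
    """sinc(u) = sin(pi u) / (pi u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)


X0 = 2.0                           # boxcar width x0 (arbitrary)
A = 0.7                            # Gaussian parameter a (arbitrary)
S_VALUES = (0.0, 0.3, 0.8, 1.5)    # spatial frequencies at which to compare

# Boxcar of Eq. (A2.12a) against x0 sinc(s x0) of Eq. (A2.12b).
box_err = max(
    abs(fourier_transform(lambda x: 1.0, s, -X0 / 2, X0 / 2) - X0 * sinc(s * X0))
    for s in S_VALUES
)

# Gaussian of Eq. (A2.13a) against Eq. (A2.13b); +/-12a is effectively infinite.
gauss_err = max(
    abs(
        fourier_transform(lambda x: math.exp(-x * x / (2 * A * A)), s, -12 * A, 12 * A)
        - math.sqrt(2 * math.pi) * A * math.exp(-2 * math.pi ** 2 * A * A * s * s)
    )
    for s in S_VALUES
)

print(f"boxcar:   max error = {box_err:.1e}")
print(f"Gaussian: max error = {gauss_err:.1e}")
```

The boxcar error is limited by the quadrature step, whereas the smooth Gaussian integrand is
reproduced almost to rounding precision.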


The third useful Fourier transform pair is

$$f(x) = \cos 2\pi s_0 x , \tag{A2.15a}$$

$$F(s) = \tfrac{1}{2} \left[ \delta(s - s_0) + \delta(s + s_0) \right] . \tag{A2.15b}$$

F(s) is calculated by writing f(x) in terms of exponentials and by use of the same
limiting process used in deriving Eq. (A2.9).

The fourth Fourier transform pair is for an infinite train of delta functions, which
is also an infinite train of delta functions, i.e.,

$$\sum_{k=-\infty}^{\infty} \delta(x - k x_0) \longleftrightarrow \frac{1}{x_0} \sum_{m=-\infty}^{\infty} \delta\!\left( s - \frac{m}{x_0} \right) . \tag{A2.16}$$

This relation can be proved by starting with a finite train of impulses and applying
the shift property [Eq. (A2.22)]. The Fourier transform is an infinite series of sinc
functions at intervals of $x_0^{-1}$. Then, by the same process used in Eq. (A2.9), the sinc
functions become Dirac delta functions in the limit as k → ∞.

The fifth Fourier transform pair is for the Heaviside step function

$$f(x) = \begin{cases} 1 , & x \ge 0 , \\ 0 , & x < 0 , \end{cases} \tag{A2.17a}$$

$$F(s) = \frac{1}{2}\, \delta(s) + \frac{1}{j2\pi s} . \tag{A2.17b}$$

The calculation of F(s) requires some care. Decompose f(x) into $f_e(x) = \tfrac{1}{2}$ and
$f_o(x) = \tfrac{1}{2}\, \mathrm{sgn}(x) = \tfrac{1}{2}$ for x > 0 and $-\tfrac{1}{2}$ for x < 0. The Fourier
transform of $f_e(x)$ is $F_e(s) = \tfrac{1}{2}\, \delta(s)$. We replace $f_o(x)$ with the functions
$\tfrac{1}{2} e^{-ax}$, x > 0, and $-\tfrac{1}{2} e^{ax}$, x < 0, and evaluate $F_o(s)$ in a limit as a → 0. Hence

$$F_o(s) = \lim_{a \to 0} \frac{1}{2} \left[ -\int_{-\infty}^{0} e^{ax}\, e^{-j2\pi sx}\, dx + \int_{0}^{\infty} e^{-ax}\, e^{-j2\pi sx}\, dx \right] = \lim_{a \to 0} \frac{-j2\pi s}{a^2 + (2\pi s)^2} = \frac{1}{j2\pi s} . \tag{A2.18}$$

Combining these results gives $F(s) = F_e(s) + F_o(s)$, which proves Eq. (A2.17b).


A2.1.2 Basic Fourier Transform Properties


We list several important properties that are readily provable. 


• Integral property

$$F(0) = \int_{-\infty}^{\infty} f(x)\, dx , \tag{A2.19a}$$

$$f(0) = \int_{-\infty}^{\infty} F(s)\, ds . \tag{A2.19b}$$

The application of Eq. (A2.19) to example five above [Eq. (A2.17)] gives the
interesting result that $f(0) = \tfrac{1}{2}$ [see Bracewell (2000) for a discussion of this
point].


• Linearity property. If f(x) and g(x) have transforms F(s) and G(s), then

$$a f(x) \longleftrightarrow a F(s) , \tag{A2.20}$$

and

$$f(x) + g(x) \longleftrightarrow F(s) + G(s) . \tag{A2.21}$$

Equation (A2.21) is fundamental and particularly useful. In terms of interferometry,
it means that the visibility function is the sum of the visibility functions of
all the components in the image.


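• Shift property

$$f(x - x_0) \longleftrightarrow e^{-j2\pi s x_0}\, F(s) , \tag{A2.22a}$$

and

$$F(s - s_0) \longleftrightarrow e^{j2\pi s_0 x}\, f(x) . \tag{A2.22b}$$


• Modulation property. From the shift property, it follows that

$$f(x) \cos 2\pi s_0 x \longleftrightarrow \tfrac{1}{2} \left[ F(s - s_0) + F(s + s_0) \right] . \tag{A2.23}$$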


• Similarity property

$$f(ax) \longleftrightarrow \frac{1}{|a|}\, F\!\left( \frac{s}{a} \right) . \tag{A2.24}$$

This important relation shows that if a function f(x) narrows, then F(s) broadens
proportionally and vice versa, so that the product of the widths of functions in
the x and s domains, Δx and Δs, respectively, satisfies the relation

$$\Delta x\, \Delta s \sim 1 . \tag{A2.25}$$

This result is the basis of the uncertainty principle in quantum mechanics, a wave
theory. It is called the time-bandwidth product in signal-processing applications
and the ambiguity function in radar astronomy. If Δx and Δs are defined as the
full width at half-maximum (FWHM), then for the boxcar-sinc function pair
[Eq. (A2.12)], ΔxΔs = 1.21, and for the Gaussian function pair [Eq. (A2.13)],
ΔxΔs = 4 ln 2/π = 0.88.


• Derivative property

$$\frac{d^n f}{dx^n} \longleftrightarrow (j2\pi s)^n F(s) , \tag{A2.26}$$

and

$$\frac{d^n F}{ds^n} \longleftrightarrow (-j2\pi x)^n f(x) . \tag{A2.27}$$


• Symmetry properties. Symmetry properties are very useful in calculating and
visualizing Fourier transforms. Any function can be divided into even and odd
components, $f_e(x)$ and $f_o(x)$, respectively, which are defined as

$$f_e(x) = \tfrac{1}{2} \left[ f(x) + f(-x) \right] , \tag{A2.28a}$$

$$f_o(x) = \tfrac{1}{2} \left[ f(x) - f(-x) \right] . \tag{A2.28b}$$

Hence, if f(x) is real and even, then F(s) is also real and even. If f(x) is real and
odd, then F(s) is imaginary and odd. The Fourier transform pair in Eq. (A2.17)
is a nice example of these symmetry properties.


• Moment property. The moments of f(x) are

$$m_n = \int_{-\infty}^{\infty} x^n f(x)\, dx . \tag{A2.29}$$

Hence, from the derivative and the integral properties,

$$\frac{d^n F(0)}{ds^n} = (-j2\pi)^n\, m_n . \tag{A2.30}$$

If these moments exist, then the Taylor expansion of F(s) is

$$F(s) = \sum_{n=0}^{\infty} \frac{(-j2\pi)^n}{n!}\, m_n s^n . \tag{A2.31}$$


Hence, if f(x) is an even function and its moments exist, the lead terms of F(s) 
are 


$$F(s) = m_0 - 2\pi^2 m_2 s^2 . \tag{A2.32}$$


• Convolution property. The convolution of two functions, f(x) and g(x), which
have Fourier transforms F(s) and G(s), respectively, is defined as

$$h(y) = \int_{-\infty}^{\infty} f(x)\, g(y - x)\, dx , \tag{A2.33}$$

which can be written with the convolution operator, ∗, as

$$h(y) = f(y) * g(y) . \tag{A2.34}$$

Note that f ∗ g = g ∗ f. The convolution property is

$$f(y) * g(y) \longleftrightarrow F(s)\, G(s) . \tag{A2.35}$$

This property can be demonstrated as follows. The Fourier transform of h(y) is

$$H(s) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} f(x)\, g(y - x)\, dx \right] e^{-j2\pi sy}\, dy , \tag{A2.36}$$

or, interchanging the order of integration,

$$H(s) = \int_{-\infty}^{\infty} f(x) \left[ \int_{-\infty}^{\infty} g(y - x)\, e^{-j2\pi sy}\, dy \right] dx . \tag{A2.37}$$

We make the variable substitution, z = y − x, to obtain

$$H(s) = \int_{-\infty}^{\infty} f(x) \left[ \int_{-\infty}^{\infty} g(z)\, e^{-j2\pi sz}\, dz \right] e^{-j2\pi sx}\, dx . \tag{A2.38}$$

The term in brackets is G(s), which can be factored out of the remaining integral,
which is F(s), so

$$H(s) = F(s)\, G(s) . \tag{A2.39}$$

Hence, the Fourier transform of the convolution of two functions is the product
of their Fourier transforms. This relationship, known as the convolution theorem,
is shown diagrammatically in Fig. A2.2. It follows that the convolution of two
functions in the frequency domain corresponds to multiplication in the time
domain.


• Correlation property. The correlation function is defined as

$$r(y) = \int_{-\infty}^{\infty} f(x)\, g(x - y)\, dx , \tag{A2.40}$$

which can be written with the correlation operator, ⋆, as

$$r(y) = f(x) \star g(x) . \tag{A2.41}$$

The correlation property is

$$f(x) \star g(x) \longleftrightarrow F(s)\, G^*(s) . \tag{A2.42}$$

The Fourier transform of Eq. (A2.40) is

$$R(s) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} f(x)\, g(x - y)\, dx \right] e^{-j2\pi sy}\, dy . \tag{A2.43}$$

Interchanging the order of integration and making the substitution z = x − y gives

$$R(s) = \int_{-\infty}^{\infty} f(x) \left[ \int_{-\infty}^{\infty} g(z)\, e^{j2\pi sz}\, dz \right] e^{-j2\pi sx}\, dx , \tag{A2.44}$$

which results in

$$R(s) = F(s)\, G^*(s) . \tag{A2.45}$$

This relationship is shown in Fig. 8.1. An example where f(x) = g(x) = boxcar
is shown in Fig. A2.3. Since f(x) is an even function, convolution and correlation
are the same, both producing even functions. Hence, F(s) is real and even, and
F(s)F(s) = F(s)F*(s).

Fig. A2.2 Relationships involving Fourier transforms and convolution. As elsewhere in this book,
the in-line asterisk indicates convolution.

Fig. A2.3 Example of the correlation and convolution theorems for an even function f(x). The
vertical arrow on the left indicates f ∗ f for the case of convolution and f ⋆ f for correlation. The
vertical arrow on the right indicates F(s)F(s) for convolution and F(s)F*(s) for correlation.


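• Parseval’s theorem. The relationship

$$\int_{-\infty}^{\infty} |f(x)|^2\, dx = \int_{-\infty}^{\infty} |F(s)|^2\, ds \tag{A2.46}$$

is known generally as Parseval’s theorem.† To prove it, we write

$$\int_{-\infty}^{\infty} f(x) f^*(x)\, dx = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} F(s)\, e^{j2\pi sx}\, ds \right] \left[ \int_{-\infty}^{\infty} F^*(s')\, e^{-j2\pi s'x}\, ds' \right] dx , \tag{A2.47}$$

or

$$\int_{-\infty}^{\infty} f(x) f^*(x)\, dx = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(s) F^*(s') \left[ \int_{-\infty}^{\infty} e^{j2\pi (s - s')x}\, dx \right] ds\, ds' . \tag{A2.48}$$

The integral in brackets is δ(s − s′), so that

$$\int_{-\infty}^{\infty} f(x) f^*(x)\, dx = \int_{-\infty}^{\infty} F(s) F^*(s)\, ds . \tag{A2.49}$$

A useful theorem in interferometry is the projection-slice theorem, which is
proved in Sect. 2.4.1.

†Parseval’s theorem originally applied to Fourier series (see Appendix A2.1.4). Rayleigh
generalized it for application to Fourier transforms. Mathematicians often refer to it as Plancherel’s
theorem. As is common practice, we use only the name “Parseval’s theorem” in this book.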
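
The convolution, correlation, and Parseval relations [Eqs. (A2.35), (A2.42), and (A2.46)] have
exact analogs for the discrete Fourier transform, which gives a quick way to check them. The
following is a minimal sketch under stated assumptions: the sequences are short, real, and
treated as periodic, so the convolution and correlation are circular rather than the infinite
integrals of the text, and the discrete Parseval relation carries a factor 1/N. The test
sequences and the helper `dft` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Direct DFT: X[k] = sum_n x[n] exp(-j 2 pi k n / N)."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
            for k in range(n)]


N = 12
f = [math.exp(-0.5 * ((n - 4) / 1.5) ** 2) for n in range(N)]   # sampled Gaussian
g = [1.0 if 2 <= n <= 5 else 0.0 for n in range(N)]             # sampled boxcar

# Circular analogs of Eqs. (A2.33) and (A2.40).
conv = [sum(f[m] * g[(n - m) % N] for m in range(N)) for n in range(N)]
corr = [sum(f[m] * g[(m - n) % N] for m in range(N)) for n in range(N)]

F, G = dft(f), dft(g)

# Convolution theorem: transform of f * g equals F G.
conv_err = max(abs(h - a * b) for h, a, b in zip(dft(conv), F, G))

# Correlation theorem (g real): transform of the correlation equals F G*.
corr_err = max(abs(r - a * b.conjugate()) for r, a, b in zip(dft(corr), F, G))

# Parseval's theorem in its discrete form.
energy_x = sum(v * v for v in f)
energy_s = sum(abs(v) ** 2 for v in F) / N

print(f"convolution theorem error: {conv_err:.1e}")
print(f"correlation theorem error: {corr_err:.1e}")
print(f"Parseval: {energy_x:.12f} vs {energy_s:.12f}")
```

All three differences are at the level of rounding error. The only change between the
convolution and correlation cases is the reversal of g, which appears in the transform domain
as the complex conjugate of G.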

A2.1.3 Two-Dimensional Fourier Transform 


The two-dimensional Fourier transform between f(x, y) and F(u, v) can be written

$$F(u, v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{-j2\pi (ux + vy)}\, dx\, dy ,$$
$$f(x, y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v)\, e^{j2\pi (ux + vy)}\, du\, dv . \tag{A2.50}$$

If x and y are in radians, then u and v are in units of cycles/radian. We write
symbolically

$$f(x, y) \longleftrightarrow F(u, v) . \tag{A2.51}$$

All of the properties in Appendix A2.1.2 have analogs in the two-dimensional
Fourier transform. For example, the shift theorem is

$$f(x - x_0, y - y_0) \longleftrightarrow e^{-j2\pi (u x_0 + v y_0)}\, F(u, v) . \tag{A2.52}$$

The two-dimensional Fourier transform can be converted to polar coordinates by
defining x = r cos θ, y = r sin θ, u = q cos φ, and v = q sin φ, which leads to

$$F(q, \phi) = \int_{0}^{2\pi} \int_{0}^{\infty} f(r, \theta)\, e^{-j2\pi q r \cos(\theta - \phi)}\, r\, dr\, d\theta . \tag{A2.53}$$

If f(r, θ) = f(r), i.e., f is azimuthally symmetric, then

$$F(q, \phi) = \int_{0}^{\infty} f(r)\, r\, dr \int_{0}^{2\pi} e^{-j2\pi q r \cos(\theta - \phi)}\, d\theta . \tag{A2.54}$$

Since the zeroth-order Bessel function is defined as

$$J_0(z) = \frac{1}{2\pi} \int_{0}^{2\pi} e^{-jz \cos\theta}\, d\theta , \tag{A2.55}$$

F(q, φ) = F(q) and

$$F(q) = 2\pi \int_{0}^{\infty} f(r)\, J_0(2\pi q r)\, r\, dr . \tag{A2.56a}$$

By symmetry,

$$f(r) = 2\pi \int_{0}^{\infty} F(q)\, J_0(2\pi q r)\, q\, dq . \tag{A2.56b}$$

Equations (A2.56a) and (A2.56b) are called the Hankel transform pair.
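
As an illustration of Eq. (A2.56a), the sketch below evaluates the Hankel transform of a
uniform disk of radius a, for which the standard result is $F(q) = a J_1(2\pi q a)/q$, with
$F(0) = \pi a^2$. It is a minimal example under stated assumptions: the Bessel functions are
computed from Bessel's integral $J_n(z) = (1/2\pi) \int_0^{2\pi} \cos(n\theta - z \sin\theta)\, d\theta$,
which for n = 0 is equivalent to Eq. (A2.55), the radial integral uses a midpoint rule, and the
function names and sample values of q are hypothetical.

```python
import math


def bessel_j(order, z, n_theta=64):
    """J_n(z) from Bessel's integral, evaluated with a periodic trapezoidal sum."""
    total = 0.0
    for k in range(n_theta):
        theta = 2.0 * math.pi * k / n_theta
        total += math.cos(order * theta - z * math.sin(theta))
    return total / n_theta


def hankel(f, q, r_max, n_r=2000):
    """Midpoint-rule estimate of Eq. (A2.56a), integrating r from 0 to r_max."""
    h = r_max / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * h
        total += f(r) * bessel_j(0, 2.0 * math.pi * q * r) * r
    return 2.0 * math.pi * total * h


RADIUS = 1.0  # disk radius a (arbitrary)


def disk_transform(q):
    """Closed form for the uniform disk: a J_1(2 pi q a) / q, and pi a^2 at q = 0."""
    if q == 0.0:
        return math.pi * RADIUS ** 2
    return RADIUS * bessel_j(1, 2.0 * math.pi * q * RADIUS) / q


Q_VALUES = (0.0, 0.25, 0.61, 1.3)
numeric = [hankel(lambda r: 1.0, q, RADIUS) for q in Q_VALUES]
max_err = max(abs(n - disk_transform(q)) for n, q in zip(numeric, Q_VALUES))

for q, value in zip(Q_VALUES, numeric):
    print(f"q = {q:4.2f}: F(q) = {value:+.5f}")
print(f"max difference from closed form: {max_err:.1e}")
```

The value near q = 0.61/a is close to zero because the first null of $J_1(z)$ lies at
z ≈ 3.83, that is, at q ≈ 1.22/(2a).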


A2.1.4 Fourier Series 


The Fourier series is a special case of the Fourier transform. A periodic function
f(x), which repeats over the interval $-x_0/2, x_0/2$, has the complex Fourier series
representation

$$f(x) = \sum_{k=-\infty}^{\infty} a_k\, e^{j2\pi kx/x_0} , \tag{A2.57}$$

where

$$a_k = \frac{1}{x_0} \int_{-x_0/2}^{x_0/2} f(x)\, e^{-j2\pi kx/x_0}\, dx . \tag{A2.58}$$

If we define $f_0(x)$ as f(x) over the interval $-x_0/2, x_0/2$, then its Fourier transform,
F(s), is given by

$$F(s) = \sum_{k=-\infty}^{\infty} F_0(k s_0)\, \delta(s - k s_0) , \tag{A2.59}$$

where $s_0 = 1/x_0$ and $F_0(k s_0) = a_k$. This is called a line spectrum: F(s) consists
of delta functions spaced at intervals $s = 1/x_0$ with amplitudes corresponding to
the Fourier coefficients. Parseval’s theorem for the Fourier series can be found by
substituting Eqs. (A2.57) and (A2.59) into Eq. (A2.49), yielding

$$\sum_{k=-\infty}^{\infty} |a_k|^2 = \frac{1}{x_0} \int_{-x_0/2}^{x_0/2} f(x) f^*(x)\, dx . \tag{A2.60}$$
A2.1.5 Truncated Functions 


The Fourier transform theory described above can be applied to functions that
are random processes. If an ergodic random process has an associated temporal
function f(x), that function generally extends to infinity, and $\int |f(x)|^2\, dx = \infty$, which
presents certain theoretical difficulties. These difficulties are mitigated by choosing
a truncated version of the function

$$f_T(x) = f(x)\, \Pi(x/x_0) , \tag{A2.61}$$

where Π(x) is the boxcar function defined after Eq. (A2.12). By the convolution
property [Eq. (A2.35)],

$$F_T(s) = F(s) * \mathrm{sinc}(s x_0) . \tag{A2.62}$$

Truncation has the effect of smoothing, or limiting the resolution of, F(s).
The power spectrum of a truncated function is usually defined as

$$P_T(s) = \frac{1}{T}\, F_T(s)\, F_T^*(s) , \tag{A2.63}$$

which has units of power and does not depend on T. Note that the Fourier
transform as defined for deterministic functions in previous sections is actually an
energy density spectrum. The conditions under which the Fourier transform of an
autocorrelation function and its power spectrum exist for random processes were
first explored and clarified by Wiener and Khinchin. Hence, the Fourier transform
between the autocorrelation function of a random process and its power spectrum is
formally called the Wiener-Khinchin theorem (or relation).


Chapter 3
Analysis of the Interferometer Response 


In this chapter, we introduce the full two-dimensional analysis of the interferometer 
response, without small-angle assumptions, and then investigate the small-field 
approximations that simplify the transformation from the measured visibility to the 
intensity distribution. There is a discussion of the relationship between the cross- 
correlation of the received signals and the cross power spectrum, which results from 
the Wiener—Khinchin relation and is fundamental to spectral line interferometry. An 
analysis of the basic response of the receiving system is also given. The appendix 
considers some approaches to the representation of noiselike signals, including the 
analytic signal, and truncation of the range of integration. 


3.1 Fourier Transform Relationship Between Intensity 
and Visibility 


3.1.1 General Case 


We begin by deriving the relationship between intensity and visibility in a
coordinate-free form and then show how the choice of a coordinate system results
in an expression in the familiar form of the Fourier transform. Suppose that the
antennas track the source under observation, which is the most common situation,
and let the unit vector $\mathbf{s}_0$ in Fig. 3.1 indicate the phase reference position introduced
in Sect. 2.3. This position, sometimes also known as the phase-tracking center,
becomes the center of the field to be imaged. For one polarization, an element of
the source of solid angle $d\Omega$ at position $\mathbf{s} = \mathbf{s}_0 + \boldsymbol{\sigma}$ contributes a component of
power $\tfrac{1}{2} A(\boldsymbol{\sigma}) I(\boldsymbol{\sigma})\, \Delta\nu\, d\Omega$ at each of the two antennas. Here, $A(\boldsymbol{\sigma})$ is the effective
collecting area of each antenna, $I(\boldsymbol{\sigma})$ is the source intensity distribution as observed
from the distance of the antennas, and $\Delta\nu$ is the bandwidth of the receiving system.
It is easily seen that this expression has the dimensions of power since the units
of I are W m⁻² Hz⁻¹ sr⁻¹. From the considerations outlined in the derivation of
Eqs. (2.1) and (2.2), including the far-field condition for the source, the resulting
component of the correlator output is proportional to the received power and to the
fringe term $\cos(2\pi\nu\tau_g)$, where $\tau_g$ is the geometric delay. The vector $\mathbf{D}_\lambda$ will specify
the baseline measured in wavelengths, and then
$\nu\tau_g = \mathbf{D}_\lambda \cdot \mathbf{s} = \mathbf{D}_\lambda \cdot (\mathbf{s}_0 + \boldsymbol{\sigma})$. Thus,
the output from the correlator is represented by

$$\begin{aligned}
r(\mathbf{D}_\lambda, \mathbf{s}_0) &= \Delta\nu \int_{4\pi} A(\boldsymbol{\sigma}) I(\boldsymbol{\sigma}) \cos\left[ 2\pi \mathbf{D}_\lambda \cdot (\mathbf{s}_0 + \boldsymbol{\sigma}) \right] d\Omega \\
&= \Delta\nu \cos(2\pi \mathbf{D}_\lambda \cdot \mathbf{s}_0) \int_{4\pi} A(\boldsymbol{\sigma}) I(\boldsymbol{\sigma}) \cos(2\pi \mathbf{D}_\lambda \cdot \boldsymbol{\sigma})\, d\Omega \\
&\quad - \Delta\nu \sin(2\pi \mathbf{D}_\lambda \cdot \mathbf{s}_0) \int_{4\pi} A(\boldsymbol{\sigma}) I(\boldsymbol{\sigma}) \sin(2\pi \mathbf{D}_\lambda \cdot \boldsymbol{\sigma})\, d\Omega .
\end{aligned} \tag{3.1}$$

Fig. 3.1 Baseline and position vectors that specify the interferometer and the source. The source
is represented by the outline on the celestial sphere.
An 


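Note that the integration of the response to the element $d\Omega$ over the source in
Eq. (3.1) requires the assumption that the source is spatially incoherent, that is,
that the radiated waveforms from different elements $d\Omega$ are uncorrelated. This
assumption is justified for essentially all cosmic radio sources. Spatial coherence is
discussed further in Sect. 15.2. Let $A_0$ be the antenna collecting area in direction
$\mathbf{s}_0$ in which the beam is pointed. We introduce a normalized reception pattern
$A_N(\boldsymbol{\sigma}) = A(\boldsymbol{\sigma})/A_0$ and consider the modified intensity distribution
$A_N(\boldsymbol{\sigma}) I(\boldsymbol{\sigma})$. Now we define the complex visibility¹ as

$$V = |V|\, e^{j\phi_v} = \int_{4\pi} A_N(\boldsymbol{\sigma}) I(\boldsymbol{\sigma})\, e^{-j2\pi \mathbf{D}_\lambda \cdot \boldsymbol{\sigma}}\, d\Omega . \tag{3.2}$$

Then by separating the real and imaginary parts, we obtain

$$\int_{4\pi} A_N(\boldsymbol{\sigma}) I(\boldsymbol{\sigma}) \cos(2\pi \mathbf{D}_\lambda \cdot \boldsymbol{\sigma})\, d\Omega = |V| \cos\phi_v , \tag{3.3}$$

$$\int_{4\pi} A_N(\boldsymbol{\sigma}) I(\boldsymbol{\sigma}) \sin(2\pi \mathbf{D}_\lambda \cdot \boldsymbol{\sigma})\, d\Omega = -|V| \sin\phi_v , \tag{3.4}$$

and from Eq. (3.1)

$$r(\mathbf{D}_\lambda, \mathbf{s}_0) = A_0\, \Delta\nu\, |V| \cos(2\pi \mathbf{D}_\lambda \cdot \mathbf{s}_0 - \phi_v) . \tag{3.5}$$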


Thus, the output of the correlator can be expressed in terms of a fringe pattern
corresponding to that for a hypothetical point source in the direction $\mathbf{s}_0$, which is
the phase reference position. As noted earlier, this is usually the center or nominal
position of the source to be imaged. The modulus and phase of V are equal to the
amplitude and phase of the fringes; the phase is measured relative to the fringe phase
for the hypothetical source. As defined above, V has the dimensions of flux density
(W m⁻² Hz⁻¹), which is consistent with its Fourier transform relationship with I.
Some authors have defined visibility as a normalized, dimensionless quantity, in
which case it is necessary to reintroduce the intensity scale in the resulting image.
Note that the bandwidth has been assumed to be small compared with the center
frequency in deriving Eq. (3.5).

¹In formulating the fundamental Fourier transform relationship in synthesis imaging, which
follows from Eq. (3.2), we use the negative exponent to derive the complex visibility function
(or mutual coherence function) from the intensity distribution, and the positive exponent for the
inverse operation. From a physical viewpoint, the choice is purely arbitrary, and the literature
contains examples of both this and the reverse convention. Our choice follows Born and Wolf
(1999) and Bracewell (1958).

Fig. 3.2 Geometric relationship between a source under observation I(l, m) and an interferometer
or one antenna pair of an array. The antenna baseline vector, measured in wavelengths, has
length $D_\lambda$ and components (u, v, w).

In introducing a coordinate system, the geometry we now consider is illustrated
in Fig. 3.2. The two antennas track the center of the field to be imaged. They
are assumed to be identical, but if they differ, $A_N(\boldsymbol{\sigma})$ is the geometric mean of
the beam patterns of the two antennas. The magnitude of the baseline vector is
measured in wavelengths at the center frequency of the observing band, and the
baseline has components (u, v, w) in a right-handed coordinate system, where u and
v are measured in a plane normal to the direction of the phase reference position.
The spacing component v is measured toward the north as defined by the plane
through the origin, the source, and the pole, and u is measured toward the east.
The component w is measured in the direction $\mathbf{s}_0$, which is the phase reference
position. On Fourier transformation, the phase reference position becomes the origin
of the derived intensity distribution I(l, m), where l and m are direction cosines
measured with respect to the axes u and v. In terms of these coordinates, we find

$$\begin{aligned}
\mathbf{D}_\lambda \cdot \mathbf{s}_0 &= w , \\
\mathbf{D}_\lambda \cdot \mathbf{s} &= ul + vm + w\sqrt{1 - l^2 - m^2} , \\
d\Omega &= \frac{dl\, dm}{\sqrt{1 - l^2 - m^2}} ,
\end{aligned} \tag{3.6}$$

where $\sqrt{1 - l^2 - m^2}$ is equal to the third direction cosine n measured with respect
to the w axis.² Note also that
$\mathbf{D}_\lambda \cdot \boldsymbol{\sigma} = \mathbf{D}_\lambda \cdot \mathbf{s} - \mathbf{D}_\lambda \cdot \mathbf{s}_0$. Thus, from Eq. (3.2):

$$V(u, v, w) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} A_N(l, m) I(l, m) \exp\left\{ -j2\pi \left[ ul + vm + w \left( \sqrt{1 - l^2 - m^2} - 1 \right) \right] \right\} \frac{dl\, dm}{\sqrt{1 - l^2 - m^2}} . \tag{3.7}$$

A factor $e^{j2\pi w}$ on the right side in Eq. (3.7) results from the measurement of angular
position with respect to the w axis. For a source on the w axis, l = m = 0, and
the argument of the exponential term in Eq. (3.7) is zero. For any other source, the
fringe phase is measured relative to that for a source on the w axis, which is the
phase reference position, $\mathbf{s}_0$ in Fig. 3.2. The function $A_N I$ in Eq. (3.7) is zero for
$l^2 + m^2 > 1$, and in practice, it usually falls to very low values for directions outside
the field to be imaged, as a result of the antenna beam pattern, the bandwidth pattern,
or the finite size of the source. Thus, we can extend the limits of integration to ±∞.
Note, however, that Eq. (3.7) requires no small-angle assumptions. The reason why
we use direction cosines rather than a linear measure of angle in interferometer
theory is that they occur in the exponential term of this relationship.
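
For a field made up of discrete point components, the integral in Eq. (3.7) reduces to a sum,
which makes the roles of u, v, and w easy to examine. The sketch below is a minimal
illustration under stated assumptions: each component is described by its flux density (the
integral of I over its solid angle) and direction cosines (l, m), the normalized reception
pattern $A_N$ is taken as unity, and the function name `visibility` and the sample values are
hypothetical.

```python
import cmath
import math


def visibility(u, v, w, sources):
    """Point-source form of Eq. (3.7) with A_N = 1 (an assumption).

    sources: iterable of (flux_density, l, m); u, v, w are in wavelengths.
    """
    total = 0j
    for flux, l, m in sources:
        n = math.sqrt(1.0 - l * l - m * m)      # third direction cosine
        phase = -2.0 * math.pi * (u * l + v * m + w * (n - 1.0))
        total += flux * cmath.exp(1j * phase)
    return total


# A component at the phase reference position gives V = flux for any baseline.
v_center = visibility(123.0, -45.0, 67.0, [(2.0, 0.0, 0.0)])

# A unit component offset by l = 1e-3 seen on u = 500 wavelengths: u*l = 0.5 turn.
v_offset = visibility(500.0, 0.0, 0.0, [(1.0, 1e-3, 0.0)])

# The same component on a baseline with w = 1000 shows the small w-term phase.
v_wterm = visibility(0.0, 0.0, 1000.0, [(1.0, 1e-3, 0.0)])

# Linearity: the visibility of two components is the sum of their visibilities.
pair = [(2.0, 0.0, 0.0), (1.0, 1e-3, 0.0)]
v_pair = visibility(500.0, 0.0, 0.0, pair)

print(f"center component: {v_center:.6f}")
print(f"offset component: {v_offset:.6f}")
print(f"w-term phase (rad): {cmath.phase(v_wterm):.6f}")
print(f"two components:   {v_pair:.6f}")
```

The w-term phase in the third case is close to $\pi l^2 w$, which is the quantity that is
neglected when the transform is reduced to two dimensions.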

²The expression for $d\Omega$ is obtained by considering the unit sphere centered on the (u, v, w)
origin. A point P on the sphere with coordinates (u, v, w) is projected onto the (u, v) plane at
u = l, v = m, and the increments dl, dm define a column of square cross section running
through (u, v, 0) parallel to the w axis. The column makes an angle $\cos^{-1} n$ with the normal to
the spherical surface at P, and $d\Omega$ is equal to the surface area intersected by the column, which is
dl dm/n, or $dl\, dm/\sqrt{1 - l^2 - m^2}$. Alternately, the solid angle can be expressed in polar coordinates
as $d\Omega = \sin\theta\, d\theta\, d\phi$, where θ and φ are the polar and azimuthal angles in the (u, v, w) plane, that
is, $\theta = \sin^{-1}\sqrt{l^2 + m^2}$ and $\phi = \tan^{-1}(m/l)$. Calculation of the Jacobian of the transformation
from (θ, φ) coordinates to (l, m) coordinates gives the result $d\Omega = dl\, dm/\sqrt{1 - l^2 - m^2}$ (Apostol
1962).

The coordinate system (l, m) defined above is a convenient one in which to
present an intensity distribution. It corresponds to the projection of the celestial
sphere onto a plane that is a tangent at the field center, as shown in Fig. 3.3. The
distance of any point in the image from the (l, m) origin is proportional to the sine
of the corresponding angle on the sky, so for small fields, distances on the image
are closely proportional to the corresponding angles. The same relationship usually
applies to the field of an optical telescope. For a detailed discussion of relationships
on the celestial sphere and tangent planes, see König (1962).

Fig. 3.3 Mapping of the celestial sphere onto an image plane, shown in one dimension. The
position of the point P is measured in terms of the direction cosine m with respect to the v axis.
When projected onto a plane surface with a scale linear in m, P appears at P′ at a distance from
the field center C proportional to sin ψ.

If all the measurements could be made with the antennas in a plane normal to
the w direction so that w = 0, Eq. (3.7) would reduce to an exact two-dimensional
Fourier transform. In general, this is not possible, and we now consider ways in
which the transform relationship can be applied. Recall first that the basis of the
synthesis imaging process is the measurement of V over a wide range of u and v. For
a ground-based array, this can be achieved by varying the length and direction of the
antenna spacing and also by tracking the field-center position as the Earth rotates.
The rotation causes the projection of $\mathbf{D}_\lambda$ to move across the (u, v) plane, and an
observation may last for 6 to 12 h. As the Earth’s rotation carries the antennas through
space, the baseline vector remains in a plane only if $\mathbf{D}_\lambda$ has no component parallel
to the rotation axis, that is, the baseline is an east-west line on the Earth’s surface.
In the general case, there is a three-dimensional distribution of the measurements of
V. The simplest form of the transform relationship that can then be used is based
on an approximation that is valid so long as the synthesized field is not too large. If
l and m are small enough that the term

$$\left( \sqrt{1 - l^2 - m^2} - 1 \right) w \simeq -\tfrac{1}{2} (l^2 + m^2)\, w \tag{3.8}$$

can be neglected, Eq. (3.7) becomes

$$V(u, v, w) \simeq V(u, v, 0) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{A_N(l, m) I(l, m)}{\sqrt{1 - l^2 - m^2}}\, e^{-j2\pi (ul + vm)}\, dl\, dm . \tag{3.9}$$

Thus, for a restricted range of l and m, V(u, v, w) is approximately independent of
w, and for the inverse transform, we can write

$$\frac{A_N(l, m) I(l, m)}{\sqrt{1 - l^2 - m^2}} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} V(u, v)\, e^{j2\pi (ul + vm)}\, du\, dv . \tag{3.10}$$

With this approximation, it is usual to omit the w dependence and write the visibility
as the two-dimensional function V(u, v). Note that the factor $\sqrt{1 - l^2 - m^2}$ in
Eqs. (3.9) and (3.10) can be subsumed into the function $A_N(l, m)$. Equation (3.10)
is a form of the van Cittert-Zernike theorem, which originated in optics and is
discussed in Sect. 15.1.1.

Fig. 3.4 When observations are made at a low angle of elevation and at an azimuth close to that of the baseline, the spacing component w becomes comparable to the baseline length D_λ, which is measured in wavelengths.

The approximation in Eq. (3.9) introduces a phase error equal to 2π times the
neglected term, that is, π(l² + m²)w. Limitation of this error to some tolerable value
places a restriction on the size of the synthesized field, which can be estimated
approximately as follows. If the antennas track the source under observation down to
low elevation angles, the values of w can approach the maximum spacings (D_λ)_max
in the array, as shown in Fig. 3.4. Also, if the spatial frequencies measured are
evenly distributed out to the maximum spacing, the synthesized beamwidth θ_b is
approximately equal to 1/(D_λ)_max. Thus, the maximum phase error is approximately


\[
\pi \left(\frac{\theta_f}{2}\right)^{2} \theta_b^{-1} , \tag{3.11}
\]

where θ_f is the width of the synthesized field. The condition that no phase errors can
exceed, say, 0.1 rad then requires that

\[
\theta_f < \tfrac{1}{3}\sqrt{\theta_b} , \tag{3.12}
\]

where the angles are measured in radians. For example, if θ_b = 1″, θ_f < 2.5 arcmin.
Much synthesis imaging in astronomy is performed within this restriction, and ways 
of imaging larger fields will be discussed later. 
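
The field-size estimate of Eqs. (3.11) and (3.12) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are chosen here for illustration, and the default tolerance of 0.1 rad is the value used in the example above.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def neglected_phase(l, m, w):
    """Phase (rad) of the term dropped in Eq. (3.9): 2*pi*w*(sqrt(1-l^2-m^2) - 1)."""
    return 2.0 * math.pi * w * (math.sqrt(1.0 - l * l - m * m) - 1.0)


def max_phase_error(theta_f, theta_b):
    """Eq. (3.11): pi * (theta_f / 2)**2 / theta_b, with both angles in radians."""
    return math.pi * (theta_f / 2.0) ** 2 / theta_b


def field_limit_exact(theta_b, tolerance=0.1):
    """Field width (rad) at which Eq. (3.11) equals `tolerance` radians."""
    return 2.0 * math.sqrt(tolerance * theta_b / math.pi)


def field_limit_rule_of_thumb(theta_b):
    """Eq. (3.12): theta_f < sqrt(theta_b) / 3, angles in radians."""
    return math.sqrt(theta_b) / 3.0


theta_b = 1.0 * ARCSEC
for name, limit in [("Eq. (3.11) solved for 0.1 rad", field_limit_exact(theta_b)),
                    ("Eq. (3.12)", field_limit_rule_of_thumb(theta_b))]:
    print(f"{name}: theta_f < {limit / ARCSEC / 60.0:.2f} arcmin")
```

The coefficient 1/3 in Eq. (3.12) is a rounded form of 2√(0.1/π) ≈ 0.36, so the two limits differ by a few percent.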


3.1.2 East-West Linear Arrays 


We now turn to the case of arrays with east-west spacings only and discuss further
the conditions for which we can put w = 0, and the resulting effects. Let us first
rotate the (u, v, w) coordinate system about the u axis until the w axis points toward
the pole, as shown in Fig. 3.5. We indicate by primes the quantities measured in the
rotated system. The (u′, v′) axes lie in a plane parallel to the Earth’s equator. The
east-west antenna spacings contain components in this plane only (i.e., w′ = 0),
and as the Earth rotates, the spacing vectors sweep out circles concentric with the
(u′, v′) origin. From Eq. (3.7), we can write


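\[
\mathcal{V}(u', v') = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \frac{A_N(l', m')\, I(l', m')}{\sqrt{1 - l'^2 - m'^2}}\; e^{-j2\pi(u'l' + v'm')}\, dl'\, dm' , \tag{3.13}
\]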

Fig. 3.5 The (u′, v′, w′) coordinate system for an east-west array. The (u′, v′) plane is the
equatorial plane and the antenna spacing vectors trace out arcs of concentric circles as the Earth
rotates. Note that the directions of the u′ and v′ axes are chosen so that the v′ axis lies in the
plane containing the pole, the observer, and the point under observation (α₀, δ₀). In Fourier
transformation from the (u′, v′) to the (l′, m′) planes, the celestial hemisphere is imaged as a
projection onto the tangent plane at the pole. The (u, v, w) coordinates for observation in the
direction (α₀, δ₀) are also shown.


where (l′, m′) are direction cosines measured with respect to (u′, v′). Equation (3.13)
holds for the whole hemisphere above the equatorial plane. The inverse transformation
yields

\[
\frac{A_N(l', m')\, I(l', m')}{\sqrt{1 - l'^2 - m'^2}} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{V}(u', v')\, e^{j2\pi(u'l' + v'm')}\, du'\, dv' . \tag{3.14}
\]

In this imaging, the hemisphere is projected onto the tangent plane at the pole, as
shown in Fig. 3.5. In practice, however, an image may be confined to a small area
within the antenna beams. In the vicinity of such an area, centered at right ascension
and declination (α₀, δ₀), angular distances in the image are compressed by a factor
sin δ₀ in the m′ dimension. Also, in imaging the (α₀, δ₀) vicinity, it is convenient if
the origin of the angular position variables is shifted to (α₀, δ₀). Expansion of the
scale and shift of the origin can be accomplished by the coordinate transformation

\[
l'' = l' , \qquad m'' = (m' - \cos\delta_0)\,\operatorname{cosec}\delta_0 . \tag{3.15}
\]

If we write F(l′, m′) for the left side of Eq. (3.14), then

\[
F(l', m') \longleftrightarrow \mathcal{V}(u', v') , \tag{3.16}
\]

and

\[
F\left[l', (m' - \cos\delta_0)\operatorname{cosec}\delta_0\right] \longleftrightarrow |\sin\delta_0|\; \mathcal{V}(u', v'\sin\delta_0)\; e^{-j2\pi v' \cos\delta_0} , \tag{3.17}
\]

where ⟷ indicates Fourier transformation. Equation (3.17) follows from the
behavior of Fourier pairs with change of variable and involves the shift and
similarity properties of Fourier transforms (see Appendix 2.1). The coordinates
(u′, v′ sin δ₀) on the right side of Eq. (3.17) represent the projection of the equatorial
plane onto the (u, v) plane, which is normal to the direction (α₀, δ₀). In the (u, v, w)
system, u = u′ and v = v′ sin δ₀. The coordinate w shown in Fig. 3.5 is equal to
−v′ cos δ₀. Thus, exp(−j2πv′ cos δ₀) in Eq. (3.17) is the same factor that occurs in Eq. (3.7)
as a result of the measurement of visibility phase relative to that for a point source
in the w direction. Equation (3.14) now becomes


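\[
\begin{aligned}
\frac{A_N(l'', m'')\, I(l'', m'')}{\sqrt{1 - l'^2 - m'^2}}
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{V}(u', v'\sin\delta_0)\, |\sin\delta_0|\; e^{-j2\pi v'\cos\delta_0}\; e^{j2\pi(u'l'' + v'm'')}\, du'\, dv' \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{V}(u, v)\, e^{j2\pi(ul'' + vm'')}\, du\, dv .
\end{aligned}
\tag{3.18}
\]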


A similar analysis is given by Brouw (1971). 

The derivation of Eq. (3.18) from Eq. (3.14) involves a redefinition of the m 
coordinate but no approximations. Equation (3.18) is of the same form as Eq. (3.10), 
in which the term in Eq. (3.8) was neglected. Thus, if we apply the imaging scheme 
of Eq. (3.10), which is based on omitting this term, to observations made with 
an east-west array, the phase errors introduced distort the image in a way that
corresponds exactly to the change of definition of the m variable to m”. Since m” is 
derived from a direction cosine measured from the v’ axis in the equatorial plane, 
there is a progressive change in the north-south angular scale over the image. The 
factor cosec δ₀ in Eq. (3.15) establishes the correct angular scale at the center of
the image, but this simple correction is acceptable only for small fields. The crucial 
point to note here is that when visibility data measured in a plane are projected into 
(u,v, w) coordinates, w is a linear function of u and v (and a linear function of v 
alone for east-west baselines). Hence, the phase error π(l² + m²)w is linear in u
and v. Phase errors of this kind have the effect of introducing position shifts in the 
resulting image, but there remains a one-to-one correspondence between points in 
the image and on the sky. The effect is simply to produce a predictable, and hence 
correctable, distortion of the coordinates. 
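
The linearity argument can be written out as a short worked step for an east-west array. This is a sketch under the small-field expansion of Eq. (3.8), using the relations u = u′, v = v′ sin δ₀, and w = −v′ cos δ₀ for the east-west case; the symbol Δm for the apparent position shift is introduced here for illustration and does not appear in the original text.

```latex
\begin{align*}
w &= -v' \cos\delta_0 = -v \cot\delta_0 , \\
ul + vm + w\left(\sqrt{1 - l^2 - m^2} - 1\right)
  &\simeq ul + vm - \tfrac{1}{2}\left(l^2 + m^2\right) w \\
  &= ul + v\left[m + \tfrac{1}{2}\left(l^2 + m^2\right)\cot\delta_0\right] , \\
\Delta m &\simeq \tfrac{1}{2}\left(l^2 + m^2\right)\cot\delta_0 .
\end{align*}
```

Under these assumptions, a source at (l, m) is imaged at (l, m + Δm). The displacement depends only on the position of the source in the field, which is why the distortion is predictable and correctable.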

It is clear from Fig. 3.5 that if all the measurements lie in the (u′, v′) plane,
then the values of v in the (u,v) plane become seriously foreshortened for 
directions close to the celestial equator. Obtaining two-dimensional resolution in 
such directions requires components of antenna spacing parallel to the Earth’s axis. 
The design of such arrays is discussed in Chap. 5. The effect of the Earth’s rotation 
is then to distribute the measurements in (u,v,w) space so that they no longer 
lie in a plane, unless the observation is of short time duration. In some cases, the 
restriction of the synthesized field in Eq. (3.12) is acceptable. In other cases, it 
may be necessary to image the entire beam to avoid source confusion, and several 
techniques are possible based on the following approaches: 


1. Equation (3.7) can be written in the form of a three-dimensional Fourier 
transform. The resulting intensity distribution is then taken from the surface of a 
unit sphere in (l, m, n) space. 

2. Large images can be constructed as mosaics of smaller ones that individually 
comply with the field restriction for two-dimensional transformation. The centers 
of the individual images must be taken at tangent points on the same unit sphere 
referred to in 1. 

3. Since in most terrestrial arrays the antennas are mounted on an approximately 
plane area of ground, measurements taken over a short time interval lie close 
to a plane in (u, v, w) space. It is therefore possible to analyze an observation 
lasting several hours as a series of short duration images, which are subsequently 
combined after adjustment of the coordinate scales. 


Practical implementation of the three approaches outlined above requires the 
nonlinear deconvolution techniques described in Chap. 11. A more detailed dis- 
cussion of the resulting methods is given in Sect. 11.7. 


3.2 Cross-Correlation and the Wiener-Khinchin Relation


The Fourier transform relationship between the power spectrum of a waveform and
its autocorrelation function, the Wiener-Khinchin relation, is expressed in Eqs. (2.6)
and (2.7). It is also useful to examine the corresponding relation for the cross-correlation
function of two different waveforms. The response of a correlator, as
used in a radio interferometer, can be written as

\[
r(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} V_1(t)\, V_2^*(t - \tau)\, dt , \tag{3.19}
\]


where the superscript asterisk indicates the complex conjugate. In practice, the 
correlation is measured for a finite time period 2T, which is usually a few 
seconds or minutes but is long compared with both the period and the reciprocal 
bandwidth of the waveforms. The factor 1/2T is sometimes omitted, but for the 
waveforms considered here, it is required to obtain convergence. Cross-correlation
is represented by the pentagram symbol (⋆):

\[
V_1(\tau) \star V_2(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} V_1(t)\, V_2^*(t - \tau)\, dt . \tag{3.20}
\]


This integral can be expressed as a convolution in the following way:

\[
V_1(\tau) \star V_2(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-\infty}^{\infty} V_1(t)\, V_{2-}^*(\tau - t)\, dt = V_1(\tau) * V_{2-}^*(\tau) , \tag{3.21}
\]


where V₂₋(τ) = V₂(−τ). Now the ν, τ Fourier transforms are as follows³: V₁(τ) ⟷
V̂₁(ν), V₂(τ) ⟷ V̂₂(ν), and V₂₋*(τ) ⟷ V̂₂*(ν). Then from the convolution
theorem,

\[
V_1(\tau) \star V_2(\tau) \longleftrightarrow \hat{V}_1(\nu)\, \hat{V}_2^*(\nu) . \tag{3.22}
\]


The right side of Eq. (3.22) is known as the cross power spectrum of V₁(t) and
V₂(t). The cross power spectrum is a function of frequency, and we see that it is
the Fourier transform of the cross-correlation, which is a function of τ. This is a
useful result, and in the case where V₁ = V₂, it becomes the Wiener-Khinchin
relation. The relationship expressed in Eq. (3.22) is the basis of cross-correlation
spectrometry, described in Sect. 8.8.2.
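
A discrete analogue of Eq. (3.22) can be verified with a fast Fourier transform. The sketch below is illustrative only: it assumes sampled, periodic (circular) waveforms in place of the infinite-time limit, uses random complex samples as stand-ins for the two voltages, and the helper name `circular_cross_correlation` is chosen here, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
# Two arbitrary complex "voltage" sequences (stand-ins for V1 and V2).
v1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v2 = rng.standard_normal(n) + 1j * rng.standard_normal(n)


def circular_cross_correlation(a, b):
    """r[k] = mean over t of a[t] * conj(b[t - k]), indices taken modulo len(a)."""
    # np.roll(b, k)[t] equals b[t - k], which implements the delay by k samples.
    return np.array([np.mean(a * np.conj(np.roll(b, k))) for k in range(len(a))])


r = circular_cross_correlation(v1, v2)

# Cross power spectrum: product of one spectrum with the conjugate of the other.
# The factor 1/n matches the averaging (the 1/2T factor) in the correlation.
cross_power = np.fft.fft(v1) * np.conj(np.fft.fft(v2)) / n

print(np.allclose(np.fft.fft(r), cross_power))
```

Setting both inputs to the same waveform gives the discrete form of the Wiener-Khinchin relation, in which the transform of the autocorrelation is the real, non-negative power spectrum.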


3.3 Basic Response of the Receiving System 


From a mathematical viewpoint, the basic components of the interferometer receiv- 
ing system are the antennas that transform the incident electric fields into voltage 
waveforms, the filters that select the frequency components to be processed, and 
the correlator that forms the averaged product of the signals. In the filter and the 
correlator, the signals may be in either analog or digital form. These components 
are shown in Fig. 3.6. Most other effects can be represented by multiplicative gain 
constants, which we shall ignore here, or as variations of the frequency response 
that can be subsumed into the expressions for the filters. Thus, we assume that 
the frequency response of the antennas and the strength of the received signal are 
effectively constant over the filter passband, which is realistic for many continuum 
observations. 


³ In this chapter, in cases where the same letter is used for functions of both time and frequency,
the circumflex (hat) accent is used to indicate functions of frequency.

Fig. 3.6 Basic components of the receiving system of a two-element interferometer.


3.3.1 Antennas 


In order to consider the responses of the two antennas independently, we should
introduce their voltage reception patterns, since the correlator responds to the
product of the signal voltages. The voltage reception pattern of an antenna V_A(l, m)
has the dimension length and responds to the electric field specified in volts per
meter. V_A(l, m) is the Fourier transform of the field distribution in the aperture
ℰ(X, Y), as shown in Sect. 15.1.2. X and Y are coordinates of position within the
antenna aperture. Omitting constant factors, we can write

\[
V_A(l, m) \propto \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathcal{E}(X, Y)\; e^{j2\pi\left[(X/\lambda)\,l + (Y/\lambda)\,m\right]}\, dX\, dY , \tag{3.23}
\]


where λ is the wavelength. In applying Eq. (3.23), X and Y are measured from the
center of each antenna aperture. The power reception pattern is proportional to the
squared modulus of the voltage reception pattern. V_A(l, m) is a complex quantity,
and it represents the phase of the radio frequency voltage at the antenna terminals as
well as the amplitude. For an interferometer (with antennas denoted by subscripts 1
and 2), the response is proportional to V_A1 V*_A2, which is purely real if the antennas
are identical. For each antenna, the collecting area A(l, m) is a real quantity. In
practice, it is usual to specify the antenna response in terms of A(l, m) and to replace
V_A(l, m) by √A(l, m), which is proportional to the modulus of V_A(l, m). Any phase
introduced by differences between the antennas is ignored in this analysis but in
effect is combined with the phase responses of the amplifiers, filters, transmission
lines, and other elements that make up the signal path to the correlator input. The
overall instrumental response of the interferometer in both phase and amplitude is
calibrated by observing an unresolved source of known position and flux density.
For the case in which the antennas track the source, both the antenna beam center
and the center of the source are at the (l, m) origin. If E(l, m) is the incident field,
the output voltage of an antenna can be written (omitting constant gain factors) as

\[
\hat{V} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E(l, m)\, \sqrt{A(l, m)}\; dl\, dm . \tag{3.24}
\]


If the antennas do not track the source, a convolution relationship of the form shown 
in Eq. (2.15) applies. 
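
The Fourier relationship of Eq. (3.23) between the aperture field and the voltage reception pattern can be illustrated in one dimension. The sketch below is an assumption-laden example rather than anything from the text: it takes a uniformly illuminated aperture, drops constant factors, and evaluates the integral with a simple trapezoidal sum. The function name and the aperture width of 20 wavelengths are hypothetical.

```python
import numpy as np


def voltage_pattern_1d(l, width_wl, n_samples=4001):
    """One-dimensional analogue of Eq. (3.23) for a uniform aperture field.

    l        : direction cosine(s) at which to evaluate the pattern
    width_wl : aperture width in wavelengths, so X / lambda spans +-width_wl / 2
    """
    x = np.linspace(-width_wl / 2.0, width_wl / 2.0, n_samples)
    field = np.ones_like(x)                     # assumed uniform illumination
    weights = np.full(n_samples, x[1] - x[0])   # trapezoidal-rule weights
    weights[0] *= 0.5
    weights[-1] *= 0.5
    kernel = np.exp(2j * np.pi * np.outer(np.atleast_1d(l), x))
    return kernel @ (field * weights)


width = 20.0
l = np.linspace(-0.2, 0.2, 401)
v_a = voltage_pattern_1d(l, width)

# Power reception pattern: squared modulus of the voltage pattern, normalized to 1 at l = 0.
power = np.abs(v_a) ** 2 / width ** 2

# For a uniform aperture the integral has the closed form width * sinc(width * l),
# where np.sinc(x) = sin(pi x) / (pi x).
print(np.allclose(v_a, width * np.sinc(width * l), atol=1e-3))
```

For this idealized aperture the first null of the pattern falls at l = 1/width, that is, at an angle of roughly λ divided by the aperture width.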


3.3.2 Filters 


The filters in Fig. 3.6 will be regarded as a representation of the overall effect
of components that determine the frequency response of the receiving channels,
including amplifiers, cables, filters, and other components. The frequency response
of a filter will be represented by H(ν), which can also be called the bandpass
function. The output of the filter V̂_c(ν) is related to the input V̂(ν) by

\[
\hat{V}_c(\nu) = H(\nu)\, \hat{V}(\nu) . \tag{3.25}
\]


The Fourier transform of H(ν) with respect to time and frequency is the impulse
response of the filter h(t), which is the response to a voltage impulse δ(t) at the
input. Thus, in the time domain, the corresponding expression to Eq. (3.25) is

\[
V_c(t) = \int_{-\infty}^{\infty} h(t')\, V(t - t')\, dt' = h(t) * V(t) , \tag{3.26}
\]


where the centerline asterisk represents convolution. In specifying filters, it is usual 
to use the frequency response rather than the impulse response because the former is 
more directly related to the properties of interest in a receiving system and is usually 
easier to measure. 
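
The equivalence of Eqs. (3.25) and (3.26) can be demonstrated for sampled data. The following sketch assumes periodic (circular) sequences, so that the discrete Fourier transform applies exactly; the exponential impulse response is a hypothetical choice made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128
v = rng.standard_normal(n)              # input voltage samples V(t)

# Hypothetical causal impulse response h(t): a decaying exponential with unit sum.
h = np.exp(-np.arange(n) / 5.0)
h /= h.sum()

# Frequency domain, Eq. (3.25): multiply the input spectrum by the bandpass function.
H = np.fft.fft(h)
v_c_freq = np.fft.ifft(H * np.fft.fft(v)).real

# Time domain, Eq. (3.26): circular convolution sum over t' of h[t'] * V[t - t'].
idx = np.arange(n)
v_c_time = np.array([np.sum(h * v[(k - idx) % n]) for k in range(n)])

print(np.allclose(v_c_freq, v_c_time))
```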

3.3.3 Correlator


The correlator⁴ produces the cross-correlation of the two voltages fed to it. If V₁(t)
and V₂(t) are the input voltages, the correlator output is

\[
r(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} V_1(t)\, V_2^*(t - \tau)\, dt , \tag{3.27}
\]


where τ is the time by which voltage V₂ is delayed with respect to voltage V₁.
For continuum observations, τ is maintained small or zero. The functions V₁ and
V₂ that represent the signals may be complex. The output of a single multiplying
device is a real voltage or number. To obtain the complex cross-correlation, which
represents both the amplitude and the phase of the visibility, one can record the
fringe oscillations and measure their phase, or use a complex correlator that contains
two multiplying circuits, as described in Sect. 6.1.7. As follows from Eqs. (3.20)
and (3.22), the Fourier transform of r(τ) is the cross power spectrum, which is
required in observations of spectral lines. This can be obtained by inserting a series
of instrumental delays in the signal to determine the cross-correlation as a function
of τ, as described in Sect. 8.8.3.


3.3.4 Response to the Incident Radiation 


We use subscripts 1 and 2 to indicate the two antennas and receiving channels as
in Fig. 3.6. The response of antenna 1 to the signal field E(l, m) given by Eq. (3.24)
is the voltage spectrum V̂(ν). We multiply this by H₁(ν) to obtain the signal at the
output of the filter, and then take the Fourier transform to go from the frequency to
the time domain. Thus

\[
V_{c1}(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E(l, m)\, \sqrt{A_1(l, m)}\; H_1(\nu)\, e^{j2\pi\nu t}\, dl\, dm\, d\nu . \tag{3.28}
\]


A similar expression can be written for the signal V_c2(t) from antenna 2, and the
output of the correlator is obtained from Eq. (3.27). Note also that if the radiation
were to have some degree of spatial coherence, we should integrate over (l, m)
independently for each antenna (Swenson and Mathur 1968), but here we make


⁴ The term correlator basically refers to a device that measures the complex cross-correlation
function r(τ), as given in Eq. (3.27). It is also used to denote simpler systems in which the time
delay τ is zero or where both signals are represented by real functions. Large systems that
cross-correlate the signal pairs of multielement arrays may contain 10’ or more correlator circuits to
accommodate many antennas and many spectral channels. Complete systems of this type are also
commonly referred to as correlators.
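the usual assumption of incoherence. Thus, the correlator output is

\[
\begin{aligned}
r(\tau) &= \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E(l, m)\, E^*(l, m)\, \sqrt{A_1(l, m)\, A_2(l, m)} \\
&\qquad \times H_1(\nu)\, H_2^*(\nu)\; e^{j2\pi\nu t}\, e^{-j2\pi\nu(t - \tau)}\; dl\, dm\, dt\, d\nu \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(l, m)\, \sqrt{A_1(l, m)\, A_2(l, m)}\; H_1(\nu)\, H_2^*(\nu)\; e^{j2\pi\nu\tau}\, dl\, dm\, d\nu .
\end{aligned}
\tag{3.29}
\]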


Here, we have replaced the squared field amplitude by the intensity I. The result is a
very general one since the use of separate response functions A₁ and A₂ for the two
antennas can accommodate different antenna designs, or different pointing offset
errors, or both. Also, different frequency responses H₁ and H₂ are used. In the case
in which the antennas and filters are identical, Eq. (3.29) becomes

\[
r(\tau) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(l, m)\, A(l, m)\, |H(\nu)|^2\, e^{j2\pi\nu\tau}\, dl\, dm\, d\nu . \tag{3.30}
\]


The result is a function of the delay τ of the signal V_c2(t) with respect to V_c1(t).
The geometric component of the delay is generally compensated by an adjustable
instrumental delay (discussed in Chaps. 6 and 7), so that τ = 0 for radiation from the
direction of the (l, m) origin. For a wavefront incident from the direction (l, m), the
difference in propagation times through the two antennas to the correlator results 
from a difference in path lengths of (ul + vm) wavelengths, for the conditions 
indicated in Eqs. (3.8) and (3.9). The corresponding time difference is (ul + vm)/v. 
If we take as V the signal from the antenna for which the path length is the greater 
(for positive / and m), then from Eq. (3.30), the correlator output becomes 


\[
r = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(l, m)\, A(l, m)\, |H(\nu)|^2\, e^{-j2\pi(ul + vm)}\, dl\, dm\, d\nu . \tag{3.31}
\]


Equation (3.31) indicates that the correlator output measures the Fourier transform 
of the intensity distribution modified by the antenna pattern. Let us assume that, as 
is often the case, the intensity and the antenna pattern are constant over the bandpass 
range of the filters, and the width of the source is small compared with the antenna 
beam. The correlator output then becomes 


\[
\begin{aligned}
r &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(l, m)\, A(l, m)\, e^{-j2\pi(ul + vm)}\, dl\, dm \int_{-\infty}^{\infty} |H(\nu)|^2\, d\nu \\
&= A_0\, \mathcal{V}(u, v) \int_{-\infty}^{\infty} |H(\nu)|^2\, d\nu ,
\end{aligned}
\tag{3.32}
\]

where A₀ is the collecting area of the antennas in the direction of the maximum
beam response and 𝒱 is the visibility as in Eq. (3.2). The filter response H(ν) is
a dimensionless (gain) quantity. If the filter response is essentially constant over a
bandwidth Δν, Eq. (3.32) becomes

\[
r = A_0\, \mathcal{V}(u, v)\, \Delta\nu . \tag{3.33}
\]


𝒱(u, v) has units of W m⁻² Hz⁻¹, A₀ has units of m², and Δν has units of Hz.
This is consistent with r, the output of the correlator, which is proportional to the 
correlated component of the received power. 


Appendix 3.1 Mathematical Representation of Noiselike 
Signals 


Electromagnetic fields and voltage waveforms that result from the emissions of 
astronomical objects are generally characterized by variations of a random nature. 
The received waveforms are usually described as ergodic (time averages and 
ensemble averages converge to equal values), which implies strict stationarity. For 
a detailed discussion, see, for example, Goodman (1985). Although such fields and 
voltages are entirely real, it is often convenient to represent them mathematically 
as complex functions. These complex functions can be manipulated in exponential 
form, and it is then necessary to take the real part as a final step in a calculation. 


A3.1.1 Analytic Signal 


A formulation that is often used in optical and radio signal analysis to represent a
function of time is known as the analytic signal, which was introduced by Gabor
(1946): see, for example, Born and Wolf (1999), Bracewell (2000), or Goodman
(1985). Let V_R(t) represent a real function of which the Fourier (voltage) spectrum is

\[
\hat{V}(\nu) = \int_{-\infty}^{\infty} V_R(t)\, e^{-j2\pi\nu t}\, dt . \tag{A3.1}
\]

The inverse transform is

\[
V_R(t) = \int_{-\infty}^{\infty} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu . \tag{A3.2}
\]


To form the analytic signal, the imaginary part that is added to produce a complex
function is the Hilbert transform [see, e.g., Bracewell (2000)] of V_R(t). One way
of forming the Hilbert transform is to multiply the Fourier spectrum of the original
function by −j sgn(ν).⁵ In forming the Hilbert transform of a function, the amplitudes
of the Fourier spectral components are unchanged, but the phases are shifted by
π/2, with the sign of the shift reversed for negative and positive frequencies. The
Hilbert transform of V_R(t), which becomes the imaginary part V_I(t), is obtained as
the inverse Fourier transform of the modified spectrum, as follows:

\[
\begin{aligned}
V_I(t) &= -j \int_{-\infty}^{\infty} \operatorname{sgn}(\nu)\, \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu \\
&= j \int_{-\infty}^{0} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu - j \int_{0}^{\infty} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu .
\end{aligned}
\tag{A3.3}
\]

The analytic signal is the complex function that represents V_R(t), and is

\[
\begin{aligned}
V(t) &= V_R(t) + jV_I(t) \\
&= \int_{-\infty}^{0} (1 + j^2)\, \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu + \int_{0}^{\infty} (1 - j^2)\, \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu \\
&= 2 \int_{0}^{\infty} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu .
\end{aligned}
\tag{A3.4}
\]


It can be seen that the analytic signal contains no negative-frequency components.
From Eq. (A3.4), another way of obtaining the analytic signal for a real function
V_R(t) is to suppress the negative-frequency components of the spectrum and double
the amplitudes of the positive ones. It can also be shown [see, e.g., Born and Wolf
(1999)] that

\[
\left\langle [V_R(t)]^2 \right\rangle = \left\langle [V_I(t)]^2 \right\rangle = \tfrac{1}{2} \left\langle V(t)\, V^*(t) \right\rangle , \tag{A3.5}
\]


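where angle brackets ⟨ ⟩ indicate the expectation. The analytic signal is so called
because, considered as a function of a complex variable, it is analytic in one half
of the complex plane. With the sign convention of Eq. (A3.4), this is the upper half,
since for ν > 0 the factor exp(j2πνt) decays when the imaginary part of t is positive.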

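From Eqs. (A3.2) and (A3.4), we obtain

\[
\int_{-\infty}^{\infty} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu = 2\, \mathrm{Re} \left[ \int_{0}^{\infty} \hat{V}(\nu)\, e^{j2\pi\nu t}\, d\nu \right] . \tag{A3.6}
\]

This is a useful equality that can be used with any Hermitian⁶ function and its
conjugate variable.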
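
The construction in Eq. (A3.4), in which the negative-frequency components are suppressed and the positive ones doubled, can be tried on sampled data. The sketch below is illustrative: it assumes a periodic, evenly sampled real waveform and uses a discrete Fourier transform in place of the integrals. The function name `analytic_signal` and the treatment of the zero-frequency and Nyquist bins (each kept once) are choices made here.

```python
import numpy as np


def analytic_signal(v_real):
    """Discrete analogue of Eq. (A3.4): zero negative frequencies, double positive ones."""
    n = len(v_real)
    weights = np.zeros(n)
    weights[0] = 1.0                     # zero-frequency bin is kept once
    weights[1:(n + 1) // 2] = 2.0        # positive-frequency bins are doubled
    if n % 2 == 0:
        weights[n // 2] = 1.0            # Nyquist bin is shared by both signs
    return np.fft.ifft(np.fft.fft(v_real) * weights)


n = 512
t = np.arange(n) / n
nu = 8                                   # whole number of cycles in the record
v_r = np.cos(2 * np.pi * nu * t)
v = analytic_signal(v_r)

# The analytic signal of cos(2 pi nu t) is exp(j 2 pi nu t).
print(np.allclose(v, np.exp(2j * np.pi * nu * t)))
```

For this example the real part reproduces the original cosine, the imaginary part is the corresponding sine, and the mean squares of the two parts are equal, in line with Eqs. (A3.5) and (A3.6).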


⁵ The function sgn(ν) is equal to 1 for ν > 0 and −1 for ν < 0. The Fourier transform of sgn(ν) is
−j/πt (see Appendix 2.1).

⁶ A Hermitian function is one in which the real part of the Fourier transform is an even function
and the imaginary part is an odd function.

In many cases of interest in radio astronomy and optics, the bandwidth of a
signal is small compared with the mean frequency ν₀, which in many instrumental
situations is the center frequency of a filter. Such a waveform resembles a sinusoid
with amplitude and phase that vary with time on a scale that is slow compared with
the period 1/ν₀. The analytic signal can then be written as

\[
V(t) = C(t)\, e^{j\left[2\pi\nu_0 t - \Phi(t)\right]} , \tag{A3.7}
\]


where C and Φ are real. The spectral components of the function under consideration
are appreciable only for small values of |ν − ν₀|. Thus, C(t) and Φ(t) consist
of low-frequency components, and the period of the time variation of C and Φ is
characteristically the reciprocal of the bandwidth. The real and imaginary parts of
the analytic signal can be written as

\[
V_R(t) = C(t) \cos\left[2\pi\nu_0 t - \Phi(t)\right] , \tag{A3.8}
\]
\[
V_I(t) = C(t) \sin\left[2\pi\nu_0 t - \Phi(t)\right] . \tag{A3.9}
\]

The modulus C(t) of the complex analytic signal can be regarded as a modulation
envelope, and Φ(t) represents the phase. In cases where the width of the signal band
and the effect of the modulation are not important, it is clearly possible to consider C
and Φ as constants, that is, to represent the signals as monochromatic waveforms of
frequency ν₀, as in the introductory discussion. The case in which the bandwidth is
small compared with the center frequency, as represented by Eq. (A3.7), is referred
to as the quasi-monochromatic case.
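
The envelope and phase of Eqs. (A3.7) to (A3.9) can be recovered numerically from a real quasi-monochromatic waveform. This is a minimal sketch under stated assumptions: the waveform is periodic and sampled, the modulation functions C(t) and Φ(t) below are arbitrary slow variations invented for the example, and the analytic signal is formed by the spectral method of Eq. (A3.4).

```python
import numpy as np


def analytic_signal(v_real):
    """Zero the negative-frequency bins and double the positive ones."""
    n = len(v_real)
    weights = np.zeros(n)
    weights[0] = 1.0
    weights[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        weights[n // 2] = 1.0            # Nyquist bin is shared by both signs
    return np.fft.ifft(np.fft.fft(v_real) * weights)


n = 4096
t = np.arange(n) / n
nu0 = 64                                             # center frequency (cycles per record)
envelope_true = 1.0 + 0.5 * np.cos(2 * np.pi * 2 * t)   # slow amplitude variation C(t)
phase_true = 0.3 * np.sin(2 * np.pi * 3 * t)            # slow phase variation Phi(t)

# Real waveform of Eq. (A3.8).
v_r = envelope_true * np.cos(2 * np.pi * nu0 * t - phase_true)

v = analytic_signal(v_r)
envelope = np.abs(v)                                     # modulus C(t)
phase = -np.angle(v * np.exp(-2j * np.pi * nu0 * t))     # Phi(t), after removing the carrier

print(np.allclose(envelope, envelope_true, atol=1e-9),
      np.allclose(phase, phase_true, atol=1e-9))
```

The recovery is accurate here because the modulation is much slower than the carrier, so the spectrum of the waveform is confined to a narrow band around ν₀ and does not spill into negative frequencies.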

As a simple example, exp(j2πνt) is the analytic signal corresponding to the real
function of time cos(2πνt). The Fourier spectrum of exp(j2πνt) has a component at
frequency ν only, but the Fourier spectrum of cos(2πνt) has components at the
two frequencies ±ν. In general, it is necessary to consider the negative-frequency
components in the analysis of waveforms, unless they are represented by the
analytic signal formulation, for which negative-frequency components are zero.
For example, in Eq. (2.8), we included negative-frequency components. If we had
omitted the negative frequencies and doubled the amplitude of the positive ones, the
cosine term in Eq. (2.9) would have been replaced by exp(j2πν₀τ). We would then have
taken the real part to arrive at the correct result. In the approach used in Chap. 2, it
is necessary to include the negative frequencies since the autocorrelation function
is purely real, and thus its Fourier transform is Hermitian. In this book, we have
generally included the negative frequencies rather than using the analytic signal and
have made use of the relationship in Eq. (A3.6) when it is advantageous to do so.

It is interesting to note another property of functions of which the real and 
imaginary parts are a Hilbert transform pair. If the real and imaginary parts of a 
waveform (i.e., a function of time) are a Hilbert transform pair, then its spectral 
components are zero for negative frequencies. If we consider the inverse Fourier 
transforms, it is seen that if the waveform amplitude is zero for t < 0, the real and 
imaginary parts of the spectrum are a Hilbert transform pair. The response of any 
electrical system to an impulse function applied at time t = 0 is zero for t < 0,
since an effect cannot precede its cause. A function representing such a response is 
referred to as a causal function, and the Hilbert transform relationship applies to its 
spectrum. 


A3.1.2 Truncated Function 


Another consideration in the representation of waveforms concerns the existence
of the Fourier transform. A condition of the existence of the transform is that the
Fourier integral over the range ±∞ be finite. Although this is not always the case,
it is possible to form a function for which the Fourier transform exists and that
approaches the original function as the value of some parameter tends toward a
limit. For example, the original function can be multiplied by a Gaussian so that the
product falls to zero at large values, and the Fourier integral exists. The Fourier
transform of the product approaches that of the original function as the width
of the Gaussian tends to infinity. Such transforms in the limit are applicable to
periodic functions such as cos(2πνt), as shown by Bracewell (2000). In the case
of noiselike waveforms, the frequency spectrum of a time function can always be
determined with satisfactory accuracy by analyzing a sufficiently long (but finite)
time interval. In practice, the time interval needs to be long compared with the
physically significant timescales that are associated with the waveform, such as the
reciprocals of the mean frequency and of the bandwidth. Thus, if the function V(t)
is truncated at ±T, the Fourier transform with respect to frequency becomes

\[
\hat{V}(\nu) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} V(t)\, e^{-j2\pi\nu t}\, dt . \tag{A3.10}
\]


It is sometimes useful to define the truncated function as V_T(t), where

\[
V_T(t) =
\begin{cases}
V(t) , & |t| \le T , \\
0 , & |t| > T ,
\end{cases}
\tag{A3.11}
\]

and to write the Fourier transform as

\[
\hat{V}(\nu) = \lim_{T\to\infty} \frac{1}{2T} \int_{-\infty}^{\infty} V_T(t)\, e^{-j2\pi\nu t}\, dt . \tag{A3.12}
\]

In the case of the analytic signal, truncation of the real part does not necessarily
result in truncation of its Hilbert transform. It may therefore be necessary that the
limits of the integral over time be ±∞, as in Eq. (A3.12), rather than ±T.

Chapter 4
Geometrical Relationships, Polarimetry, 
and the Interferometer Measurement Equation 


In this chapter, we start to examine some of the practical aspects of interferometry. 
These include baselines, antenna mounts and beam shapes, and the response to 
polarized radiation, all of which involve geometric considerations and coordinate 
systems. The discussion is concentrated on Earth-based arrays with tracking 
antennas, which illustrate the principles involved, although the same principles 
apply to other systems such as those that include one or more antennas in Earth 
orbit. 


4.1 Antenna Spacing Coordinates and (u, v) Loci 


Various coordinate systems are used to specify the relative positions of the antennas
in an array, and of these, one of the more convenient for terrestrial arrays is shown
in Fig. 4.1. A right-handed Cartesian coordinate system is used, where X and Y
are measured in a plane parallel to the Earth’s equator, X in the meridian plane¹
(defined as the plane through the poles of the Earth and the reference point in the
array), Y toward the east, and Z toward the north pole. In terms of hour angle H
and declination δ, coordinates (X, Y, Z) are measured toward (H = 0, δ = 0),
(H = −6ʰ, δ = 0), and (δ = 90°), respectively. If (X_λ, Y_λ, Z_λ) are the components


¹ In VLBI observations, it is customary to set the X axis in the Greenwich meridian, in which case
H is measured with respect to that meridian rather than a local one.

Fig. 4.1 The (X, Y, Z) coordinate system for specification of relative positions of antennas. Directions of the axes specified are in terms of hour angle H and declination δ.

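of D_λ in the (X, Y, Z) system, the components (u, v, w) are given by

\[
\begin{bmatrix} u \\ v \\ w \end{bmatrix}
=
\begin{bmatrix}
\sin H & \cos H & 0 \\
-\sin\delta \cos H & \sin\delta \sin H & \cos\delta \\
\cos\delta \cos H & -\cos\delta \sin H & \sin\delta
\end{bmatrix}
\begin{bmatrix} X_\lambda \\ Y_\lambda \\ Z_\lambda \end{bmatrix} .
\tag{4.1}
\]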


Here (H, δ) are usually the hour angle and declination of the phase reference
position. The elements of the transformation matrix given above are the direction
cosines of the (u, v, w) axes with respect to the (X, Y, Z) axes and can easily be
derived from the relationships in Fig. 4.2. Another method of specifying the baseline
vector is in terms of its length, D, and the hour angle and declination, (h, d), of the
intersection of the baseline direction with the Northern Celestial Hemisphere. The
coordinates in the (X, Y, Z) system are then given by

\[
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
= D
\begin{bmatrix} \cos d \cos h \\ -\cos d \sin h \\ \sin d \end{bmatrix} .
\tag{4.2}
\]


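The coordinates in the (u, v, w) system are, from Eqs. (4.1) and (4.2),

\[
\begin{bmatrix} u \\ v \\ w \end{bmatrix}
= D_\lambda
\begin{bmatrix}
\cos d \sin(H - h) \\
\sin d \cos\delta - \cos d \sin\delta \cos(H - h) \\
\sin d \sin\delta + \cos d \cos\delta \cos(H - h)
\end{bmatrix} .
\tag{4.3}
\]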
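
The agreement between Eqs. (4.1), (4.2), and (4.3) can be confirmed numerically. The sketch below is illustrative rather than part of the original text: the function names are chosen here, all angles are in radians, and the baseline length and angles in the example are arbitrary values.

```python
import numpy as np


def xyz_to_uvw_matrix(hour_angle, dec):
    """Transformation matrix of Eq. (4.1) for phase reference position (H, delta)."""
    sH, cH = np.sin(hour_angle), np.cos(hour_angle)
    sd, cd = np.sin(dec), np.cos(dec)
    return np.array([[sH, cH, 0.0],
                     [-sd * cH, sd * sH, cd],
                     [cd * cH, -cd * sH, sd]])


def baseline_xyz(length, h, d):
    """Eq. (4.2): (X, Y, Z) from baseline length and direction (h, d)."""
    return length * np.array([np.cos(d) * np.cos(h),
                              -np.cos(d) * np.sin(h),
                              np.sin(d)])


def baseline_uvw(length, h, d, hour_angle, dec):
    """Eq. (4.3): (u, v, w) directly from (length, h, d) and (H, delta)."""
    dh = hour_angle - h
    return length * np.array([
        np.cos(d) * np.sin(dh),
        np.sin(d) * np.cos(dec) - np.cos(d) * np.sin(dec) * np.cos(dh),
        np.sin(d) * np.sin(dec) + np.cos(d) * np.cos(dec) * np.cos(dh),
    ])


# Arbitrary example: baseline of 1000 wavelengths, source at H = -30 deg, delta = 40 deg.
H, delta = np.radians(-30.0), np.radians(40.0)
D_wl, h, d = 1000.0, np.radians(75.0), np.radians(10.0)

uvw_from_matrix = xyz_to_uvw_matrix(H, delta) @ baseline_xyz(D_wl, h, d)
uvw_direct = baseline_uvw(D_wl, h, d, H, delta)
print(np.allclose(uvw_from_matrix, uvw_direct))
```

Because the matrix of Eq. (4.1) is a rotation, the length of the (u, v, w) vector equals the baseline length for any pointing direction.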

Fig. 4.2 Relationships between the (X, Y, Z) and (u, v, w) coordinate systems. The (u, v, w) system
is defined for observation in the direction of the point S, which has hour angle and declination
H and δ. As shown, S is in the eastern half of the hemisphere and H is therefore negative. The
direction cosines in the transformation matrix in Eq. (4.1) follow from the relationships in this
diagram. The relationship in Eq. (4.2) can also be derived if we let S represent the direction of the
baseline and put the baseline coordinates (h, d) for (H, δ).

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The (D, h, d) system was used more widely in the earlier literature, particularly for 
instruments involving only two antennas; see, for example, Rowson (1963). 

When the $(X, Y, Z)$ components of a new baseline are first established, the usual
practice is to determine the elevation $\mathcal{E}$, azimuth $A$, and length of the baseline by
field surveying techniques. Figure 4.3 shows the relationship between $(\mathcal{E}, A)$ and
other coordinate systems; see also Appendix 4.1. For latitude $\mathcal{L}$, using Eqs. (4.2)
and (A4.2), we obtain


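$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
D \begin{bmatrix}
\cos\mathcal{L}\sin\mathcal{E} - \sin\mathcal{L}\cos\mathcal{E}\cos A \\
\cos\mathcal{E}\sin A \\
\sin\mathcal{L}\sin\mathcal{E} + \cos\mathcal{L}\cos\mathcal{E}\cos A
\end{bmatrix} .
\tag{4.4}
$$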
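A minimal Python sketch of Eq. (4.4) follows, converting a surveyed baseline (length, elevation, azimuth) at a given latitude into $(X, Y, Z)$. The function name and argument order are assumptions made for illustration; angles are in radians, and the output has the same units as `D`.

```python
import math

def xyz_from_survey(D, elev, az, lat):
    """Eq. (4.4): (X, Y, Z) from baseline length, elevation, azimuth, latitude."""
    X = D * (math.cos(lat) * math.sin(elev)
             - math.sin(lat) * math.cos(elev) * math.cos(az))
    Y = D * math.cos(elev) * math.sin(az)
    Z = D * (math.sin(lat) * math.sin(elev)
             + math.cos(lat) * math.cos(elev) * math.cos(az))
    return X, Y, Z
```

Two limiting cases are easy to check by hand: a horizontal east-west baseline ($\mathcal{E} = 0$, $A = 90°$) gives $(0, D, 0)$ at any latitude, and a vertical baseline ($\mathcal{E} = 90°$) gives $(D\cos\mathcal{L}, 0, D\sin\mathcal{L})$.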


Fig. 4.3 Relationship between the celestial coordinates $(H, \delta)$ and the elevation and azimuth
$(\mathcal{E}, A)$ of a point S as seen by an observer at latitude $\mathcal{L}$. P is the celestial pole and Z the observer’s
zenith. The parallactic angle $\psi_p$ is the position angle of the observer’s vertical on the sky measured
from north toward east. The lengths of the arcs measured in terms of angles subtended at the center
of the sphere O are as follows:

$$
\begin{aligned}
ZP &= 90^\circ - \mathcal{L}, & PQ &= \mathcal{L}, & SR &= \mathcal{E}, & RQ &= A, \\
SZ &= 90^\circ - \mathcal{E}, & SP &= 90^\circ - \delta, & SQ &= \cos^{-1}(\cos\mathcal{E}\cos A).
\end{aligned}
$$

The required relationships can be obtained by application of the sine and cosine rules for spherical
triangles to ZPS and PQS and are given in Appendix 4.1. Note that with S in the eastern half of the
observer’s sky, as shown, $H$ and $\psi_p$ are negative.

Examination of Eq. (4.1) or (4.3) shows that the locus of the projected antenna
spacing components $u$ and $v$ defines an ellipse with hour angle as the variable. Let
$(H_0, \delta_0)$ be the phase reference position. Then from Eq. (4.1), we have


$$
u^2 + \left(\frac{v - Z_\lambda\cos\delta_0}{\sin\delta_0}\right)^2 = X_\lambda^2 + Y_\lambda^2 .
\tag{4.5}
$$


In the $(u, v)$ plane, Eq. (4.5) defines an ellipse² with the semimajor axis equal to
$\sqrt{X_\lambda^2 + Y_\lambda^2}$ and the semiminor axis equal to $\sin\delta_0\sqrt{X_\lambda^2 + Y_\lambda^2}$, as in Fig. 4.4a. The
ellipse is centered on the $v$ axis at $(u, v) = (0, Z_\lambda\cos\delta_0)$. The arc of the ellipse that
is traced out during any observation depends on the azimuth, elevation, and latitude
of the baseline; the declination of the source; and the range of hour angle covered,
as illustrated in Fig. 4.5. Since $V(-u, -v) = V^*(u, v)$, any observation supplies
simultaneous measurements on two arcs, which are part of the same ellipse only if
$Z_\lambda = 0$.

²The first mention of elliptical loci appears to have been by Rowson (1963).

Fig. 4.4 (a) Spacing vector locus in the $(u, v)$ plane from Eq. (4.5). (b) Spacing vector locus in
the $(u', v')$ plane from Eq. (4.8). The lower arc in each diagram represents the locus of conjugate
values of visibility. Unless the source is circumpolar, the cutoff at the horizon limits the lengths of
the arcs.
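The elliptical locus of Eq. (4.5) can be checked numerically. The sketch below, whose function names and sample baseline are assumptions chosen for illustration, generates $(u, v)$ points from Eq. (4.1) over a range of hour angle and evaluates the residual of Eq. (4.5) at each point. Angles are in radians and lengths in wavelengths.

```python
import math

def uv_track(X, Y, Z, dec0, hour_angles):
    """(u, v) from Eq. (4.1) for each hour angle at declination dec0."""
    points = []
    for H in hour_angles:
        u = X * math.sin(H) + Y * math.cos(H)
        v = ((-X * math.cos(H) + Y * math.sin(H)) * math.sin(dec0)
             + Z * math.cos(dec0))
        points.append((u, v))
    return points

def ellipse_residual(u, v, X, Y, Z, dec0):
    """Left side minus right side of Eq. (4.5); zero on the ellipse."""
    scaled_v = (v - Z * math.cos(dec0)) / math.sin(dec0)
    return u ** 2 + scaled_v ** 2 - (X ** 2 + Y ** 2)
```

The conjugate points $(-u, -v)$ satisfy the same equation with $Z_\lambda$ replaced by $-Z_\lambda$, that is, they lie on an ellipse centered at $(0, -Z_\lambda\cos\delta_0)$, so the two arcs share one ellipse only when $Z_\lambda = 0$.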


4.2 (u’,v’) Plane 


The $(u', v')$ plane, which was introduced in Sect. 3.1.2 with regard to east-west
baselines, is also useful in discussing certain aspects of the behavior of arrays in
general. This plane is normal to the direction of the pole and can be envisaged as the
equatorial plane of the Earth. For non-east-west baselines, we can also consider the
projection of the spacing vectors onto the $(u', v')$ plane. All such projected vectors
sweep out circular loci as the Earth rotates. The spacing components in the $(u', v')$
plane are derived from those in the $(u, v)$ plane by the transformation $u' = u$,
$v' = v\,\mathrm{cosec}\,\delta_0$. In terms of the components of the baseline $(X_\lambda, Y_\lambda, Z_\lambda)$ for two antennas,
we obtain from Eq. (4.1)


$$
u' = X_\lambda\sin H_0 + Y_\lambda\cos H_0
\tag{4.6}
$$

$$
v' = -X_\lambda\cos H_0 + Y_\lambda\sin H_0 + Z_\lambda\cot\delta_0 .
\tag{4.7}
$$
Fig. 4.5 Examples of $(u, v)$ loci to show the variation with baseline azimuth $A$ and observing
declination $\delta$ (the baseline elevation $\mathcal{E}$ is zero). The baseline length in all cases is equal to the
length of the axes measured from the origin. The tracking range is $-4$ to $+4$ h for $\delta = -30°$,
and $-6$ to $+6$ h in all other cases. Marks along the loci indicate 1-h intervals in tracking. Note the
change in ellipticity for east-west baselines ($A = 90°$) with $\delta = 30°$ and with $\delta = 70°$. The loci
are calculated for latitude 40°.


The loci are circles centered on $(0, Z_\lambda\cot\delta_0)$, with radii $q'$ given by

$$
q'^2 = u'^2 + (v' - Z_\lambda\cot\delta_0)^2 = X_\lambda^2 + Y_\lambda^2 ,
\tag{4.8}
$$


as shown in Fig. 4.4b. The projected spacing vectors that generate the loci rotate
with constant angular velocity $\omega_e$, the rotation velocity of the Earth, which is easier
to visualize than the elliptic motion in the $(u, v)$ plane. In particular, problems
involving the effect of time, such as the averaging of visibility data, are conveniently
dealt with in the $(u', v')$ plane. Examples of its use will be found in Sects. 4.4, 6.4.2,
and 16.3.2. In Fourier transformation, the conjugate variables of $(u', v')$ are $(l', m')$,
where $l' = l$ and $m' = m\sin\delta_0$; that is, the image plane is compressed by a factor
$\sin\delta_0$ in the $m$ direction.
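The circular locus and the uniform rotation can be illustrated with a short Python sketch based on Eqs. (4.6) to (4.8). The function names and the test baseline are assumptions for illustration only; angles are in radians and lengths in wavelengths.

```python
import math

def uv_prime(X, Y, Z, H0, dec0):
    """Eqs. (4.6) and (4.7): spacing components in the (u', v') plane."""
    u_p = X * math.sin(H0) + Y * math.cos(H0)
    v_p = -X * math.cos(H0) + Y * math.sin(H0) + Z / math.tan(dec0)
    return u_p, v_p

def locus_angle(X, Y, Z, H0, dec0):
    """Polar angle of the spacing vector about the center (0, Z cot dec0)."""
    u_p, v_p = uv_prime(X, Y, Z, H0, dec0)
    return math.atan2(v_p - Z / math.tan(dec0), u_p)
```

For any hour angle, the squared distance from the center equals $X_\lambda^2 + Y_\lambda^2$, and the polar angle advances by exactly the change in hour angle. This is the sense in which the vector rotates at the constant rate $\omega_e$.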
4.3 Fringe Frequency


The component $w$ of the baseline represents the path difference to the two antennas
for a plane wave incident from the phase reference position. The corresponding time
delay is $w/\nu_0$, where $\nu_0$ is the center frequency of the observing band. The relative
phase of the signals at the two antennas changes by $2\pi$ radians when $w$ changes
by unity. Thus, the frequency of the oscillations at the output of the correlator that
combines the signals is


$$
\frac{dw}{dt} = \frac{dw}{dH}\frac{dH}{dt}
= -\omega_e\left[X_\lambda\cos\delta\sin H + Y_\lambda\cos\delta\cos H\right]
= -\omega_e u\cos\delta ,
\tag{4.9}
$$

where $\omega_e = dH/dt = 7.29115\times10^{-5}$ rad s$^{-1}$ is the rotation velocity
of the Earth with respect to the fixed stars; for greater accuracy, see Seidelmann
(1992). The sign of $dw/dt$ indicates whether the phase is increasing or decreasing
with time. The result shown above applies to the case in which the signals suffer no
time-varying instrumental phase changes between the antennas and the correlator
inputs. In an array in which the antennas track a source, time delays to compensate
for the space path differences $w$ are applied to maintain correlation of the signals.
If an exact compensating delay were introduced in the radio frequency section of
the receivers, the relative phases of the signals at the correlator input would remain
constant, and the correlator output would show no fringes. However, except in some
low-frequency systems like LOFAR (de Vos et al. 2009), the compensating delays
are usually introduced at an intermediate frequency, of which the band center $\nu_d$
is much less than the observing frequency $\nu_0$. The adjustment of the compensating
delay introduces a rate of phase change $2\pi\nu_d(dw/dt)/\nu_0 = -\omega_e u(\cos\delta)\nu_d/\nu_0$. The
resulting fringe frequency at the correlator output is

$$
\nu_f = \frac{dw}{dt}\left(1 \mp \frac{\nu_d}{\nu_0}\right)
= -\omega_e u\cos\delta\left(1 \mp \frac{\nu_d}{\nu_0}\right) ,
\tag{4.10}
$$


where the negative sign refers to upper-sideband reception and the positive sign
to lower-sideband reception; these distinctions and the double-sideband case are
explained in Sect. 6.1.8. From Eq. (4.3), the right side of Eq. (4.10) is equal to
$-\omega_e D\cos d\cos\delta\sin(H - h)(\nu_0 \mp \nu_d)/c$. Note that $(\nu_0 \mp \nu_d)$ is usually determined
by one or more local oscillator frequencies.
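To give a feel for the magnitudes in Eqs. (4.9) and (4.10), here is a small Python sketch. The function name, its keyword arguments, and the sample values are assumptions for illustration; $u$ is in wavelengths, the declination in radians, and the result in hertz.

```python
import math

OMEGA_E = 7.29115e-5  # Earth rotation rate in rad/s, as quoted with Eq. (4.9)

def fringe_frequency(u, dec, nu_d=0.0, nu_0=1.0, upper_sideband=True):
    """Eq. (4.10): fringe frequency at the correlator output.

    With nu_d = 0 (no delay compensation) this reduces to dw/dt of Eq. (4.9).
    """
    sign = -1.0 if upper_sideband else 1.0
    return -OMEGA_E * u * math.cos(dec) * (1.0 + sign * nu_d / nu_0)
```

For example, with $u = 10^5$ wavelengths at $\delta = 0$ and no compensating delay, the natural fringe frequency has a magnitude of about 7.3 Hz. Setting `nu_d` equal to `nu_0` in the upper-sideband case gives zero, matching the statement that compensation at the radio frequency removes the fringes.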


4.4 Visibility Frequencies 


Fig. 4.6 The $(u', v')$ plane showing sinusoidal corrugations that represent the visibility of a point
source. For simplicity, only the real part of the visibility is included. The most rapid variation in
the visibility is encountered at the point P, where the direction of the spacing locus is normal to
the ridges in the visibility. $\omega_e$ is the rotation velocity of the Earth.

As explained in Sect. 3.1, the phase of the complex visibility is measured with
respect to that of a hypothetical point source at the phase reference position. The
fringe-frequency variations do not appear in the visibility function, but slower
variations occur that depend on the position of the radiating sources within the
field. We now examine the maximum temporal frequency of the visibility variations.
Consider a point source represented by the delta function $\delta(l_1, m_1)$. The visibility
function is the Fourier transform of $\delta(l_1, m_1)$, which is


$$
e^{-j2\pi(ul_1 + vm_1)} = \cos 2\pi(ul_1 + vm_1) - j\sin 2\pi(ul_1 + vm_1) .
\tag{4.11}
$$


This expression represents two sets of sinusoidal corrugations, one real and one
imaginary. The corrugations represented by the real part of Eq. (4.11) are shown in
$(u', v')$ coordinates in Fig. 4.6, where the arguments of the trigonometric functions
in Eq. (4.11) become $2\pi(u'l_1 + v'm_1\sin\delta_0)$. The frequency of the corrugations in
terms of cycles per unit distance in the $(u', v')$ plane is $l_1$ in the $u'$ direction, $m_1\sin\delta_0$
in the $v'$ direction, and

$$
r_1' = \sqrt{l_1^2 + m_1^2\sin^2\delta_0}
\tag{4.12}
$$


in the direction of most rapid variations. Expression (4.12) is maximized at the pole
and then becomes equal to $r_1$, which is the angular distance of the source from the
$(l, m)$ origin. For any antenna pair, the spatial frequency locus in the $(u', v')$ plane is
a circle of radius $q'$ generated by a vector rotating with angular velocity $\omega_e$, where
$q'$ is as defined in Eq. (4.8). From Fig. 4.6, it is clear that the temporal variation of
the measured visibility is greatest at the point P and is equal to $\omega_e r_1' q'$. This is a
useful result, since if $r_1'$ represents a position at the edge of the field to be imaged,
it indicates that to follow the most rapid variations, the visibility must be sampled
at time intervals sufficiently small compared with $(\omega_e r_1' q')^{-1}$. Also, we may wish to
alternate between two frequencies or polarizations during an observation, and these
changes must be made on a similarly short timescale. Note that this requirement is
also covered by the sampling theorem in Sect. 5.2.1.
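The timescale $(\omega_e r_1' q')^{-1}$ is simple to evaluate. The Python sketch below uses hypothetical function names and takes the source offsets $(l_1, m_1)$ in radians and $q'$ in wavelengths; it illustrates Eq. (4.12) and the rate quoted above and is not a prescription for choosing an averaging time.

```python
import math

OMEGA_E = 7.29115e-5  # Earth rotation rate in rad/s

def max_visibility_rate(l1, m1, dec0, q_prime):
    """Maximum temporal frequency omega_e * r1' * q' of the visibility."""
    r1_prime = math.sqrt(l1 ** 2 + (m1 * math.sin(dec0)) ** 2)  # Eq. (4.12)
    return OMEGA_E * r1_prime * q_prime

def sampling_timescale(l1, m1, dec0, q_prime):
    """Timescale (omega_e * r1' * q')^-1 with which sampling is compared."""
    return 1.0 / max_visibility_rate(l1, m1, dec0, q_prime)
```

As an example under these assumptions, a source 1 arcmin (about $2.9\times10^{-4}$ rad) from the field center, observed near the pole with $q' = 10^5$ wavelengths, gives a rate of roughly $2\times10^{-3}$ s$^{-1}$, so sampling intervals should be short compared with several hundred seconds.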


4.5 Calibration of the Baseline 


The position parameters (X, Y, Z) for each antenna relative to a common reference 
point can usually be established to a few centimeters or millimeters by a conven- 
tional engineering survey. Except at long wavelengths, the accuracy required is 
greater than this. We must be able to compute the phase at any hour angle for a 
point source at the phase reference position to an accuracy of, say, 1° and subtract 
it from the observed phase. This reference phase is represented by the factor $e^{j2\pi w}$
in Eq. (3.7), and it is therefore necessary to calculate $w$ to 1/360 of the observing
wavelength. The baseline parameters can be obtained to the required accuracy from
observations of calibration sources for which the positions are accurately known.
The phase of such a calibrator observed at the phase reference position $(H_0, \delta_0)$
should ideally be zero. However, if practical uncertainties are taken into account,
the measured phase is, from Eq. (4.1),


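$$
2\pi\,\Delta w + \phi_{\rm in} = 2\pi\left(\cos\delta_0\cos H_0\,\Delta X_\lambda - \cos\delta_0\sin H_0\,\Delta Y_\lambda + \sin\delta_0\,\Delta Z_\lambda\right) + \phi_{\rm in} ,
\tag{4.13}
$$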


where the prefix $\Delta$ indicates the uncertainty in the associated quantity, and $\phi_{\rm in}$
is an instrumental phase term for the two antennas involved. If a calibrator is
observed over a wide range of hour angle, $\Delta X_\lambda$ and $\Delta Y_\lambda$ can be obtained from
the even and odd components, respectively, of the phase variation with $H_0$. To
measure $\Delta Z_\lambda$, calibrators at more than one declination must be included. A possible
procedure is to observe several calibrators at different declinations, repeating a
cycle of observations for several hours. For the $k$th observation, we can write, from
Eq. (4.13),


$$
a_k\,\Delta X_\lambda + b_k\,\Delta Y_\lambda + c_k\,\Delta Z_\lambda + \phi_{\rm in} = \phi_k ,
\tag{4.14}
$$


where $a_k$, $b_k$, and $c_k$ are known source parameters, and $\phi_k$ is the measured phase.
The calibrator source position need not be accurately known since the phase
measurements can be used to estimate both the source positions and the baselines.
Techniques for this analysis are discussed in Sect. 12.2. In practice, the instrumental
phase $\phi_{\rm in}$ will vary slowly with time: instrumental stability is discussed in Chap. 7.
Also, there will be atmospheric phase variations, which are discussed in Chap. 13.
These effects set the final limit on the attainable accuracy in observing both
calibrators and sources under investigation.
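A set of observations of the form of Eq. (4.14) can be solved by ordinary least squares. The sketch below is one possible illustration, not the procedure of Sect. 12.2: it assumes noise-free or lightly noisy phases that are already unwrapped, a constant $\phi_{\rm in}$, and accurately known calibrator positions. All function names are hypothetical, and a small Gauss-Jordan solver is included so that the example needs only the standard library.

```python
import math

def design_row(H0, dec0):
    """Coefficients (a_k, b_k, c_k, 1) of Eq. (4.14), read off Eq. (4.13)."""
    two_pi = 2.0 * math.pi
    return [two_pi * math.cos(dec0) * math.cos(H0),
            -two_pi * math.cos(dec0) * math.sin(H0),
            two_pi * math.sin(dec0),
            1.0]

def solve_linear(A, b):
    """Gauss-Jordan elimination with partial pivoting (small square systems)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_baseline_errors(observations):
    """Least-squares (dX, dY, dZ, phi_in) from (H0, dec0, phase) triples.

    Phases are in radians; dX, dY, dZ are returned in wavelengths.
    """
    rows = [design_row(H0, dec0) for H0, dec0, _ in observations]
    phases = [p for _, _, p in observations]
    n = 4
    # Normal equations (A^T A) x = A^T b
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    Atb = [sum(r[i] * p for r, p in zip(rows, phases)) for i in range(n)]
    return solve_linear(AtA, Atb)
```

With calibrators at a single declination, the $c_k$ column is a multiple of the constant column and the system becomes singular, which is the algebraic form of the statement that $\Delta Z_\lambda$ requires more than one declination.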

Measurement of baseline parameters to an accuracy of order 1 part in $10^7$ (e.g.,
3 mm in 30 km) implies timing accuracy of order $10^{-7}\omega_e^{-1} \simeq 1$ ms. Timekeeping
is discussed in Sects. 9.5.8 and 12.3.3.
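The 1-ms figure can be checked with a one-line estimate. The sketch below assumes that a clock error $\Delta t$ acts as an hour-angle error $\omega_e\,\Delta t$, which displaces the equatorial baseline components by a fraction of the same order, and it uses the value of $\omega_e$ quoted with Eq. (4.9):

$$
\Delta t \approx \frac{10^{-7}}{\omega_e}
= \frac{10^{-7}}{7.29115\times10^{-5}\ \mathrm{s^{-1}}}
\approx 1.4\times10^{-3}\ \mathrm{s} ,
$$

which is of order 1 ms.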


4.6 Antennas 


4.6.1 Antenna Mounts 


In discussing the dependence of the measured phase on the baseline components, 
we have ignored any effects introduced by the antennas, which is tantamount to 
assuming that the antennas are identical and their effects on the signals cancel out. 
This, however, is only approximately true. In most synthesis arrays, the antennas 
must have collecting areas of tens or hundreds of square meters for reasons of 
sensitivity. Except for dipole arrays at meter wavelengths, the antennas required are 
large structures that must be capable of accurately tracking a radio source across the 
sky. Tracking antennas are almost always constructed either on equatorial mounts 
(also called polar mounts) or on altazimuth mounts, as illustrated in Fig. 4.7. In 
an equatorial mount, the polar axis is parallel to the Earth’s axis of rotation, and 
tracking a source requires only that the antenna be turned about the polar axis at the
sidereal rate. Equatorial mounts are mechanically more difficult to construct than
altazimuth ones and are found mainly on antennas built prior to the introduction of
computers for control and coordinate conversion.

Fig. 4.7 Schematic diagrams of antennas on (a) equatorial (polar) and (b) altazimuth mounts. In
the positions shown, the declination and elevation axes are normal to the plane of the page. In
the equatorial mount, there is a distance $D_a$ between the two rotational axes, but in the altazimuth
mount, the axes often intersect, as shown.

In most tracking arrays used in radio astronomy, the antennas are circularly 
symmetrical reflectors. A desirable feature is that the axis of symmetry of the 
reflecting surface intersect both the rotation axes of the mount. If this is not the 
case, pointing motions will cause the antenna to have a component of motion along 
the direction of the beam. It is then necessary to take account of phase changes 
associated with small pointing corrections, which may differ from one antenna to 
another. In most antenna mounts, however, whether of equatorial or altazimuth type, 
the reflector axis intersects the rotation axes with sufficient precision that phase 
errors of this type are negligible. 

It is convenient but not essential that the two rotation axes of the mount intersect. 
The intersection point then provides an appropriate reference point for defining the 
baseline between antennas, since whatever direction in which the antenna points, 
its aperture plane is always the same distance from that point as measured along 
the axis of the beam. In most large equatorially mounted antennas, the polar and 
declination axes do not intersect. In many cases, there is an offset of several meters 
between the polar and declination axes. Wade (1970) considered the implication of 
this offset for high-accuracy phase measurements and showed that it is necessary to 
take account of variations in the offset distance and in the accuracy of alignment 
of the polar axis. These results can be obtained as follows. Let $\mathbf{i}$ and $\mathbf{s}$ be unit
vectors in the direction of the polar axis and the direction of the source under
observation, respectively, and let $\mathbf{D}_a$ be the spacing vector between the two axes
measured perpendicular to $\mathbf{i}$ (see Fig. 4.7a). The quantity that we need to compute is
the projection of $\mathbf{D}_a$ in the direction of observation, $\mathbf{D}_a\cdot\mathbf{s}$. Since $\mathbf{D}_a$ is perpendicular
to $\mathbf{i}$, the cosine of the angle between $\mathbf{D}_a$ and $\mathbf{s}$ is $\sqrt{1 - (\mathbf{i}\cdot\mathbf{s})^2}$. Thus,


$$
\mathbf{D}_a\cdot\mathbf{s} = D_a\sqrt{1 - (\mathbf{i}\cdot\mathbf{s})^2} ,
\tag{4.15}
$$


where $D_a$ is the magnitude of $\mathbf{D}_a$. In the $(X, Y, Z)$ coordinate system in which the
baseline components are measured, $\mathbf{i}$ has direction cosines $(i_X, i_Y, i_Z)$, and $\mathbf{s}$ has
direction cosines given by the transformation matrix on the right side of Eq. (4.2),
but with $h$ and $d$ replaced by $H$ and $\delta$, which refer to the direction of observation.
If the polar axis is correctly aligned to within about 1 arcmin, $i_X$ and $i_Y$ are of order
$10^{-3}$ and $i_Z \simeq 1$. Thus, we can use the direction cosines to evaluate Eq. (4.15), and
ignoring second-order terms in $i_X$ and $i_Y$, we obtain


$$
\mathbf{D}_a\cdot\mathbf{s} = D_a\left(\cos\delta - i_X\sin\delta\cos H + i_Y\sin\delta\sin H\right) .
\tag{4.16}
$$


If the magnitude of $\mathbf{D}_a$ is expressed in wavelengths, the difference in the values of
$\mathbf{D}_a\cdot\mathbf{s}$ for the two antennas must be added to the $w$ component of the baseline given
by Eq. (4.1) when calculating the reference phase at the field center. To do this, it is
first necessary to determine the unknown constants in Eq. (4.16), which can be done
by adding a term of the form $2\pi(\alpha\cos\delta_0 + \beta\sin\delta_0\cos H_0 + \gamma\sin\delta_0\sin H_0)$ to the
right side of Eq. (4.13) and extending the solution to include $\alpha$, $\beta$, and $\gamma$. The result
then represents the differences in the corresponding mechanical dimensions of the
two antennas. Note that the terms in $i_X$ and $i_Y$ in Eq. (4.16) are important only when
$D_a$ is large. If $D_a$ is no more than one wavelength, it should be possible to ignore
them.
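The quality of the first-order expansion can be examined numerically. The following Python sketch compares the exact projection of Eq. (4.15) with the approximation of Eq. (4.16); the function names and the misalignment values used to test it are assumptions for illustration. Angles are in radians, and the result has the units of `Da`.

```python
import math

def source_direction(H, dec):
    """Direction cosines of s in (X, Y, Z): Eq. (4.2) with (h, d) -> (H, dec)."""
    return (math.cos(dec) * math.cos(H),
            -math.cos(dec) * math.sin(H),
            math.sin(dec))

def offset_exact(Da, i_vec, H, dec):
    """Eq. (4.15): Da * sqrt(1 - (i . s)^2) for a unit polar-axis vector i."""
    s = source_direction(H, dec)
    dot = sum(a * b for a, b in zip(i_vec, s))
    return Da * math.sqrt(max(0.0, 1.0 - dot * dot))

def offset_first_order(Da, iX, iY, H, dec):
    """Eq. (4.16): expansion to first order in the misalignments iX, iY."""
    return Da * (math.cos(dec)
                 - iX * math.sin(dec) * math.cos(H)
                 + iY * math.sin(dec) * math.sin(H))
```

For misalignments of a few times $10^{-4}$, the two expressions differ by a fraction of order $10^{-7}$ of $D_a$ away from the pole, consistent with neglecting the second-order terms.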

The preceding analysis can be extended to the case of an altazimuth mount
by letting $\mathbf{i}$ represent the direction of the azimuth axis, as in Fig. 4.7b. Then
$i_X = \cos(\mathcal{L} + \varepsilon)$, $i_Y = \sin\varepsilon'$, and $i_Z = \sin(\mathcal{L} + \varepsilon)$, where $\mathcal{L}$ is the latitude and $\varepsilon$
and $\varepsilon'$ are, respectively, the tilt errors in the $XZ$ plane and in the plane containing the
$Y$ axis and the local vertical. The errors again should be quantities of order $10^{-3}$. In
many altazimuth mounts, the axes are designed to intersect, and $D_a$ represents only
a structural tolerance. Thus, we assume that $D_a$ is small enough to allow terms in
$i_Y D_a$ and $\varepsilon D_a$ to be ignored, and evaluation of Eq. (4.15) gives


$$
\mathbf{D}_a\cdot\mathbf{s} = D_a\sqrt{1 - \left(\sin\mathcal{L}\sin\delta + \cos\mathcal{L}\cos\delta\cos H\right)^2} = D_a\cos\mathcal{E} ,
\tag{4.17}
$$


where $\mathcal{E}$ is the elevation of direction $\mathbf{s}$: see Eq. (A4.1) of Appendix 4.1. Correction
terms of this form can be added to the expressions for the baseline calibration and
for $w$.
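Equation (4.17) translates directly into code. In the sketch below, the function name is an assumption, the tilt errors are taken to be zero as in the derivation above, and the clamp to zero only guards against rounding when the source is at the zenith.

```python
import math

def altaz_offset(Da, lat, H, dec):
    """Eq. (4.17): projection Da * cos(elevation) for an altazimuth mount."""
    # sin(elevation), the expression in parentheses in Eq. (4.17)
    sin_elev = (math.sin(lat) * math.sin(dec)
                + math.cos(lat) * math.cos(dec) * math.cos(H))
    return Da * math.sqrt(max(0.0, 1.0 - sin_elev ** 2))
```

The correction vanishes for a source at the zenith and reaches its full value $D_a$ for a source on the horizon.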


4.6.2 Beamwidth and Beam-Shape Effects 


The interpretation of data taken with arrays containing antennas with nonidentical 
beamwidths is not always a straightforward matter. Each antenna pair responds 
to an effective intensity distribution that is the product of the actual intensity 
of the sky and the geometric mean of the normalized beam profiles. If different 
pairs of antennas respond to different effective distributions, then, in principle, the 
Fourier transform relationship between $I(l, m)$ and $V(u, v)$ cannot be applied to the
ensemble of observations. Mixed arrays are sometimes used in VLBI when it is 
necessary to make use of antennas that have different designs. However, in VLBI 
studies, the source structure under investigation is very small compared with the 
widths of the antenna beams, so the differences in the beams can usually be ignored. 
If cases arise in which different beams are used and the source is not small compared 
with beamwidths, it is possible to restrict the measurements to the field defined by 
the narrowest beam by convolution of the visibility data with an appropriate function 
in the (u, v) plane. 

A problem similar to that of unmatched beams occurs if the antennas have
altazimuth mounts and the beam contours are not circularly symmetrical about the
nominal beam axis. As a point in the sky is tracked using an altazimuth mount, the
beam rotates with respect to the sky about this nominal axis. This rotation does not
occur for equatorial mounts. The angle between the vertical at the antenna and the
direction of north at the point being observed (defined by the great circle through
the point and the North Pole) is the parallactic angle $\psi_p$ in Fig. 4.3. Application of
the sine rule to the spherical triangle ZPS gives


$$
\frac{-\sin\psi_p}{\cos\mathcal{L}} = \frac{-\sin H}{\cos\mathcal{E}} = \frac{\sin A}{\cos\delta} ,
\tag{4.18}
$$


which can be combined with Eq. (A4.1) or (A4.2) to express $\psi_p$ as a function of
$(A, \mathcal{E})$ or $(H, \delta)$. If the beam has elongated contours and width comparable to
the source under observation, rotation of the beam causes the effective intensity 
distribution to vary with hour angle. This is particularly serious in the case 
of observations to reveal the structure of the most distant Universe, for which 
foreground sources need to be accurately removed. For the Australia Pathfinder 
Array (DeBoer et al. 2009), the 12-m-diameter antennas have altazimuth mounts, 
with a third axis that allows the reflector, feed supports, and feeds to be rotated 
about the reflector axis so the beam pattern and the angle of polarization remain 
fixed relative to the sky. 
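For reference, the parallactic angle can be computed as a function of $(H, \delta)$ with a few lines of Python. The elevation expression is the one that appears in Eq. (4.17). The two-argument arctangent form is the standard spherical-astronomy result and is an addition here, not a formula quoted in the text; it is consistent with Eq. (4.18) and with the sign convention that $\psi_p$ is negative for $H < 0$. Function names are assumptions; angles are in radians.

```python
import math

def elevation(lat, H, dec):
    """Elevation from sin(E) = sin(L) sin(dec) + cos(L) cos(dec) cos(H)."""
    return math.asin(math.sin(lat) * math.sin(dec)
                     + math.cos(lat) * math.cos(dec) * math.cos(H))

def parallactic_angle(lat, H, dec):
    """Parallactic angle psi_p, negative east of the meridian (H < 0)."""
    return math.atan2(math.cos(lat) * math.sin(H),
                      math.sin(lat) * math.cos(dec)
                      - math.cos(lat) * math.sin(dec) * math.cos(H))
```

The first equality of Eq. (4.18), $\sin\psi_p\cos\mathcal{E} = \sin H\cos\mathcal{L}$, provides a direct check on the two functions together.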


4.7 Polarimetry 


Polarization measurements are very important in radio astronomy. Most synchrotron 
radiation shows a small degree of polarization that indicates the distribution of the 
magnetic fields within the source. As noted in Chap. 1, this polarization is generally 
linear (plane) and can vary in magnitude and position angle over the source. As 
frequency is increased, the percentage polarization often increases because the 
depolarizing action of Faraday rotation is reduced. Polarization of radio emission 
also results from the Zeeman effect in atoms and molecules, cyclotron radiation and 
plasma oscillations in the solar atmosphere, and Brewster angle effects at planetary 
surfaces. The measure of polarization that is almost universally used in astronomy 
is the set of four parameters introduced by Sir George Stokes in 1852. It is assumed 
here that readers have some familiarity with the concept of Stokes parameters or 
can refer to one of numerous texts that describe them [e.g., Born and Wolf (1999); 
Kraus and Carver (1973); Wilson et al. (2013)]. 

Stokes parameters are related to the amplitudes of the components of the electric
field, $E_x$ and $E_y$, resolved in two perpendicular directions normal to the direction
of propagation. Thus, if $E_x$ and $E_y$ are represented by $\mathcal{E}_x(t)\cos[2\pi\nu t + \delta_x(t)]$ and
$\mathcal{E}_y(t)\cos[2\pi\nu t + \delta_y(t)]$, respectively, Stokes parameters are defined as follows:


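$$
\begin{aligned}
I &= \langle\mathcal{E}_x^2(t)\rangle + \langle\mathcal{E}_y^2(t)\rangle \\
Q &= \langle\mathcal{E}_x^2(t)\rangle - \langle\mathcal{E}_y^2(t)\rangle \\
U &= 2\,\langle\mathcal{E}_x(t)\,\mathcal{E}_y(t)\cos[\delta_x(t) - \delta_y(t)]\rangle \\
V &= 2\,\langle\mathcal{E}_x(t)\,\mathcal{E}_y(t)\sin[\delta_x(t) - \delta_y(t)]\rangle ,
\end{aligned}
\tag{4.19}
$$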
where the angular brackets denote the expectation or time average. This averaging is
necessary because in radio astronomy, we are dealing with fields that vary with time
in a random manner. Of the four parameters, $I$ is a measure of the total intensity of
the wave, $Q$ and $U$ represent the linearly polarized component, and $V$ represents the
circularly polarized component. Stokes parameters can be converted to a measure
of polarization with a more direct physical interpretation as follows:


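$$
m_\ell = \frac{\sqrt{Q^2 + U^2}}{I}
\tag{4.20}
$$

$$
m_c = \frac{V}{I}
\tag{4.21}
$$

$$
m_t = \frac{\sqrt{Q^2 + U^2 + V^2}}{I}
\tag{4.22}
$$

$$
\theta = \frac{1}{2}\tan^{-1}\left(\frac{U}{Q}\right) , \qquad 0 \le \theta \le \pi ,
\tag{4.23}
$$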


where $m_\ell$, $m_c$, and $m_t$ are the degrees of linear, circular, and total polarization,
respectively, and $\theta$ is the position angle of the plane of linear polarization. For
monochromatic signals, $m_t = 1$ and the polarization can be fully specified by just
three parameters. For random signals such as those of cosmic origin, $m_t \le 1$, and all
four parameters are required. The Stokes parameters all have the dimensions of flux
density or intensity, and they propagate in the same manner as the electromagnetic
field. Thus, they can be determined by measurement or calculation at any point along
a wave path, and their relative magnitudes define the state of polarization at that
point. Stokes parameters combine additively for independent waves. When they are
used to specify the total radiation from any point on a source, $I$, which measures the
total intensity, is always positive, but $Q$, $U$, and $V$ can take both positive and negative
values depending on the position angle or sense of rotation of the polarization.
The corresponding visibility values measured with an interferometer are complex
quantities, as will be discussed later.
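Equations (4.20) to (4.23) can be evaluated with a short function. In this sketch the function name is an assumption, real Stokes parameters at a single point on the source are assumed, and the two-argument arctangent is used so that the quadrant of $\theta$ follows the signs of $Q$ and $U$. The angle is reduced modulo $\pi$, since the position angle of linear polarization is defined only to within a half turn.

```python
import math

def polarization_from_stokes(I, Q, U, V):
    """Eqs. (4.20)-(4.23): (m_l, m_c, m_t, theta), with theta in radians."""
    m_l = math.hypot(Q, U) / I
    m_c = V / I
    m_t = math.sqrt(Q * Q + U * U + V * V) / I
    theta = (0.5 * math.atan2(U, Q)) % math.pi
    return m_l, m_c, m_t, theta
```

For instance, $(I, Q, U, V) = (1, 0.03, 0.04, 0)$ corresponds to 5% linear polarization with no circular component.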

In considering the response of interferometers and arrays, up to this point we 
have ignored the question of polarization. This simplification can be justified by 
the assumption that we have been dealing with completely unpolarized radiation for 
which only the parameter $I$ is nonzero. In that case, the response of an interferometer
with identically polarized antennas is proportional to the total flux density of the 
radiation. As will be shown below, in the more general case, the response is 
proportional to a linear combination of two or more Stokes parameters, where 
the combination is determined by the polarizations of the antennas. By observing 
with different states of polarization of the antennas, it is possible to separate the 
responses to the four parameters and determine the corresponding components of 
the visibility. The variation of each parameter over the source can thus be imaged 
individually, and the polarization of the radiation emitted at any point can be 
determined. There are alternative methods of describing the polarization state of
a wave, of which the coherency matrix is perhaps the most important (Ko 1967a,b). 
However, the classical treatment in terms of Stokes parameters remains widely used 
by astronomers, and we therefore follow it here. 


4.7.1 Antenna Polarization Ellipse 


The polarization of an antenna in either transmission or reception can be described
in general by stating that the electric vector of a transmitted signal traces out an
elliptical locus in the wavefront plane. Most antennas are designed so that the ellipse
approximates a line or circle, corresponding to linear or circular polarization, in the
central part of the main beam. However, precisely linear or circular responses are
hardly achievable in practice. As shown in Fig. 4.8, the essential characteristics of
the polarization ellipse are given by the position angle $\psi$ of the major axis, and by
the axial ratio, which it is convenient to express as the tangent of an angle $\chi$, where
$-\pi/4 \le \chi \le \pi/4$.

Fig. 4.8 (a) Description of the general state of polarization of an antenna in terms of the
characteristics of the ellipse generated by the electric vector in the transmission of a sinusoidal
signal. The position angle $\psi$ of the major axis is measured with respect to the $x$ axis, which
points toward the direction of north on the sky. A wave approaching from the sky is traveling
toward the reader, in the direction of the positive $z$ axis. For such a wave, the arrow on the ellipse
indicates the direction of right-handed polarization. (b) Model antenna that radiates the electric
field represented by the ellipse in (a) when a signal is applied to the terminal A. $\cos\chi$ and $\sin\chi$
indicate the amplitudes of the voltage responses of the units shown, and $\pi/2$ indicates a phase lag.

An antenna of arbitrary polarization can be modeled in terms of two idealized
dipoles as shown in Fig. 4.8b. Consider transmitting with this antenna by applying a
signal waveform to the terminal A. The signals to the dipoles pass through networks
with voltage responses proportional to $\cos\chi$ and $\sin\chi$, and the signal to the $y'$ dipole
also passes through a network that introduces a $\pi/2$ phase lag. Thus, the antenna
produces field components of amplitude $\mathcal{E}_{x'}$ and $\mathcal{E}_{y'}$ in phase quadrature along the
directions of the major and minor axes of the ellipse. If the antenna input is a radio
frequency sine wave $V_0\cos 2\pi\nu t$, then the field components are


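$$
\begin{aligned}
\mathcal{E}_{x'}\cos(2\pi\nu t) &\propto V_0\cos\chi\cos(2\pi\nu t) \\
\mathcal{E}_{y'}\sin(2\pi\nu t) &\propto V_0\sin\chi\sin(2\pi\nu t) .
\end{aligned}
\tag{4.24}
$$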


In these equations, the $y'$ component lags the $x'$ component by $\pi/2$. If $\chi = \pi/4$,
the radiated electric vector traces a circular locus with the sense of rotation from the
$x'$ axis to the $y'$ axis (i.e., counterclockwise in Fig. 4.8a). This is consistent with the
quarter-cycle delay in the signal to the $y'$ dipole. Then a wave propagating in the
positive $z'$ direction of a right-handed coordinate system (i.e., toward the reader in
Fig. 4.8a) is right circularly polarized in the IEEE (1977) definition. (This definition
is now widely adopted, but in some of the older literature, such a wave would be
defined as left circularly polarized.) The International Astronomical Union (IAU
1974) has adopted the IEEE definition and states that the position angle of the
electric vector on the sky should be measured from north through east with reference
to the system of right ascension and declination. The IAU also states that “the
polarization of incoming radiation, for which the position angle, $\theta$, of the electric
vector, measured at a fixed point in space, increases with time, is described as
right-handed and positive.” Note that Stokes parameters in Eq. (4.19) specify only the
field in the $(x, y)$ plane, and to determine whether a circularly polarized wave is left-
or right-handed, the direction of propagation must be given. From Eq. (4.19) and
the definitions of $E_x$ and $E_y$ that precede it, a wave traveling in the positive $z$
direction in right-handed coordinates is right circularly polarized for positive $V$.

In reception, an electric vector that rotates in a clockwise direction in Fig. 4.8
produces a voltage in the $y'$ dipole that leads the voltage in the $x'$ dipole by $\pi/2$ in
phase, and the two signals therefore combine in phase at A. For counterclockwise
rotation, the signals at A are in antiphase and cancel one another. Thus, the antenna
in Fig. 4.8 receives right-handed waves incident from the positive $z$ direction (that
is, traveling toward negative $z$), and it transmits right-handed polarization in the
direction toward positive $z$. To receive a right-handed wave propagating down from
the sky (in the positive $z$ direction), the polarity of one of the dipoles must be
reversed, which requires that $\chi = -\pi/4$.
To determine the interferometer response, we begin by considering the output of
the antenna modeled in Fig. 4.8b. We define the field components in complex form:


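$$
\begin{aligned}
E_{x'}(t) &= \mathcal{E}_{x'}(t)\,e^{j[2\pi\nu t + \delta_{x'}(t)]} \\
E_{y'}(t) &= \mathcal{E}_{y'}(t)\,e^{j[2\pi\nu t + \delta_{y'}(t)]} .
\end{aligned}
\tag{4.25}
$$

The signal voltage received at A in Fig. 4.8b, expressed in complex form, is

$$
V' = E_{x'}\cos\chi - jE_{y'}\sin\chi ,
\tag{4.26}
$$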


where the factor $-j$ represents the $\pi/2$ phase lag applied to the $y'$ signal, for the
fields represented by Eq. (4.25). Now we need to specify the polarization of the
incident wave in terms of Stokes parameters. In accordance with IAU (1974), the
axes used are in the directions of north and east on the sky, which are represented
by $x$ and $y$ in Fig. 4.8a. In terms of the field in the $x$ and $y$ directions, the components
of the field in the $x'$ and $y'$ directions are


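$$
\begin{aligned}
E_{x'}(t) &= \left[\mathcal{E}_x(t)\,e^{j\delta_x(t)}\cos\psi + \mathcal{E}_y(t)\,e^{j\delta_y(t)}\sin\psi\right]e^{j2\pi\nu t} \\
E_{y'}(t) &= \left[-\mathcal{E}_x(t)\,e^{j\delta_x(t)}\sin\psi + \mathcal{E}_y(t)\,e^{j\delta_y(t)}\cos\psi\right]e^{j2\pi\nu t} .
\end{aligned}
\tag{4.27}
$$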


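Derivation of the response at the output of the correlator for antennas $m$ and $n$ of an
array involves straightforward manipulation of some rather lengthy expressions that
are not reproduced here. The steps are as follows:


1. Substitute $E_{x'}$ and $E_{y'}$ from Eq. (4.27) into Eq. (4.26) to obtain the output of each
antenna.

2. Indicate values of $\psi$, $\chi$, and $V'$ for the two antennas by subscripts $m$ and $n$ and
calculate the correlator output, $R_{mn} = G_{mn}\langle V'_m V'^{*}_n\rangle$, where $G_{mn}$ is an instrumental
gain factor.

3. Substitute Stokes parameters for $\mathcal{E}_x$, $\mathcal{E}_y$, $\delta_x$, $\delta_y$ using Eq. (4.19) as follows:

$$
\begin{aligned}
\langle(\mathcal{E}_x e^{j\delta_x})(\mathcal{E}_x e^{j\delta_x})^*\rangle &= \langle\mathcal{E}_x^2\rangle = \tfrac{1}{2}(I + Q) \\
\langle(\mathcal{E}_y e^{j\delta_y})(\mathcal{E}_y e^{j\delta_y})^*\rangle &= \langle\mathcal{E}_y^2\rangle = \tfrac{1}{2}(I - Q) \\
\langle(\mathcal{E}_x e^{j\delta_x})(\mathcal{E}_y e^{j\delta_y})^*\rangle &= \langle\mathcal{E}_x\mathcal{E}_y e^{j(\delta_x - \delta_y)}\rangle = \tfrac{1}{2}(U + jV) \\
\langle(\mathcal{E}_x e^{j\delta_x})^*(\mathcal{E}_y e^{j\delta_y})\rangle &= \langle\mathcal{E}_x\mathcal{E}_y e^{-j(\delta_x - \delta_y)}\rangle = \tfrac{1}{2}(U - jV) .
\end{aligned}
\tag{4.28}
$$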
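The result is

$$
\begin{aligned}
R_{mn} = \tfrac{1}{2}G_{mn}\bigl\{\,
& I_v\left[\cos(\psi_m - \psi_n)\cos(\chi_m - \chi_n) + j\sin(\psi_m - \psi_n)\sin(\chi_m + \chi_n)\right] \\
+\; & Q_v\left[\cos(\psi_m + \psi_n)\cos(\chi_m + \chi_n) + j\sin(\psi_m + \psi_n)\sin(\chi_m - \chi_n)\right] \\
+\; & U_v\left[\sin(\psi_m + \psi_n)\cos(\chi_m + \chi_n) - j\cos(\psi_m + \psi_n)\sin(\chi_m - \chi_n)\right] \\
-\; & V_v\left[\cos(\psi_m - \psi_n)\sin(\chi_m + \chi_n) + j\sin(\psi_m - \psi_n)\cos(\chi_m - \chi_n)\right]\bigr\} .
\end{aligned}
\tag{4.29}
$$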


In this equation, a subscript $v$ has been added to Stokes parameter symbols
to indicate that they represent the complex visibility for the distribution of the
corresponding parameter over the source, not simply the intensity or brightness of
the radiation. Equation (4.29) is a useful general formula that applies to all cases.
It was originally derived by Morris et al. (1964) and later by Weiler (1973). In the
derivation by Morris et al., the sign of $V_v$ is opposite to that given by Weiler and in
Eq. (4.29). This difference results from the convention for the sense of rotation for
circular polarization. In the convention we have followed in Fig. 4.8, two identical
antennas both adjusted to receive right circularly polarized radiation would have
parameters $\psi_m = \psi_n$ and $\chi_m = \chi_n = -\pi/4$. In Eq. (4.29), these values correspond
to a positive sign for $V_v$. Thus, in Eq. (4.29), positive $V_v$ represents right circular
polarization incident from the sky, which is in agreement with the IAU definition
in 1973 (IAU 1974). The derivation by Morris et al. predates the IAU definition
and follows the commonly used convention at that time, in which the sign for
$V$ was the reverse of that in the IAU definition. Note that in what follows, the
factor 1/2 in Eq. (4.29) is omitted and considered to be subsumed within the overall
gain factor. Equation (4.29) was the main basis for polarization measurements in
radio interferometry for at least three decades until an alternative formulation was
developed by Hamaker et al. (1996). This later formulation is introduced in Sect. 4.8.
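
As a numerical check on Eq. (4.29), the following minimal Python sketch evaluates its right-hand side for given Stokes visibilities and antenna parameters. The function name `correlator_response` and its argument order are our own choices for illustration and are not taken from any software package; angles are assumed to be in radians, and the factor 1/2 is retained.

```python
import math

def correlator_response(stokes, psi_m, chi_m, psi_n, chi_n, gain=1.0):
    """Evaluate Eq. (4.29) for Stokes visibilities (I_v, Q_v, U_v, V_v)."""
    i_v, q_v, u_v, v_v = stokes
    dpsi, spsi = psi_m - psi_n, psi_m + psi_n   # difference and sum of psi
    dchi, schi = chi_m - chi_n, chi_m + chi_n   # difference and sum of chi
    term_i = math.cos(dpsi) * math.cos(dchi) + 1j * math.sin(dpsi) * math.sin(schi)
    term_q = math.cos(spsi) * math.cos(schi) + 1j * math.sin(spsi) * math.sin(dchi)
    term_u = math.sin(spsi) * math.cos(schi) - 1j * math.cos(spsi) * math.sin(dchi)
    term_v = math.cos(dpsi) * math.sin(schi) + 1j * math.sin(dpsi) * math.cos(dchi)
    return 0.5 * gain * (i_v * term_i + q_v * term_q + u_v * term_u - v_v * term_v)

# Illustrative (made-up) Stokes visibilities.
STOKES = (1.0, 0.1, 0.05, 0.02)
# Two identical right circularly polarized antennas: chi = -pi/4 for both.
R_RCP = correlator_response(STOKES, 0.3, -math.pi / 4, 0.3, -math.pi / 4)
# Crossed linear feeds at position angles 0 and 90 degrees.
R_CROSSED = correlator_response(STOKES, 0.0, 0.0, math.pi / 2, 0.0)
```

With these inputs, `R_RCP` should equal $\tfrac{1}{2}(I_v + V_v)$, which illustrates the positive sign of $V_v$ for right circular polarization, and `R_CROSSED` should equal $\tfrac{1}{2}(U_v + jV_v)$.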


4.7.2 Stokes Visibilities 


As noted above, the symbols $I_v$, $Q_v$, $U_v$, and $V_v$ in Eq. (4.29) refer to the
corresponding visibility values as measured by the spaced antennas. We shall therefore refer to
these quantities as Stokes visibilities, following the nomenclature of Hamaker et al.
(1996). Stokes visibilities are the quantities required in imaging polarized emission,
and they can be derived from the correlator output values by using Eq. (4.29). This
equation is considerably simplified when the nominal polarization characteristics of
practical antennas are inserted. First, consider the case in which both antennas are
identically polarized. Then $\chi_m = \chi_n$, $\psi_m = \psi_n$, and Eq. (4.29) becomes

$$ R_{mn} = G_{mn} \left[ I_v + Q_v \cos 2\psi_m \cos 2\chi_m + U_v \sin 2\psi_m \cos 2\chi_m - V_v \sin 2\chi_m \right] . \tag{4.30} $$

In considering linearly polarized antennas, it is convenient to use subscripts $x$ and
$y$ to indicate two orthogonal planes of polarization. For example, $R_{xy}$ represents the
correlator output for antenna $m$ with polarization $x$ and antenna $n$ with polarization
$y$. For linearly polarized antennas, $\chi_m = \chi_n = 0$. Consider two antennas, each with
separate outputs for linear polarizations $x$ and $y$. Then for parallel polarizations,
omitting gain constants, we obtain from Eq. (4.30)

$$ R_{xx} = I_v + Q_v \cos 2\psi_m + U_v \sin 2\psi_m \,. \tag{4.31} $$


Here, $\psi_m$ is the position angle of the antenna polarization measured from celestial
north in the direction of east. The $y$ polarization angle is equal to the $x$ polarization
angle plus $\pi/2$. For $\psi_m$ equal to 0°, 45°, 90°, and 135°, the output $R_{xx}$ is proportional
to $(I_v + Q_v)$, $(I_v + U_v)$, $(I_v - Q_v)$, and $(I_v - U_v)$, respectively. By using antennas
with these polarization angles, $I_v$, $Q_v$, and $U_v$, but not $V_v$, can be measured. In
many cases, circular polarization is negligibly small, and the inability to measure
$V_v$ is not a serious problem. However, $Q_v$ and $U_v$ are often only a few percent
of $I_v$, and in attempting to measure them with identical feeds, one faces the usual
problems of measuring a small difference in two much larger quantities. The same
is true if one attempts to measure $V_v$ using identical circular feeds for which $\chi =
\pm\pi/4$ and the response is proportional to $(I_v \mp V_v)$. These problems are reduced
by using oppositely polarized feeds to measure $Q_v$, $U_v$, or $V_v$. For an example of
measurement of $V_v$, see Weiler and Raimond (1976).

With oppositely polarized feeds, we insert in Eq. (4.29) $\psi_n = \psi_m + \pi/2$,
and $\chi_m = -\chi_n$. For linear polarization, the $\chi$ terms are zero and the planes of
polarization orthogonal. The antennas are then described as cross-polarized, as
typified by crossed dipoles. Omitting constant gain factors and using the $x$ and $y$
subscripts defined above, we obtain for the correlator output

$$
\begin{aligned}
R_{xy} &= -Q_v \sin 2\psi_m + U_v \cos 2\psi_m + jV_v \\
R_{yx} &= -Q_v \sin 2\psi_m + U_v \cos 2\psi_m - jV_v \,,
\end{aligned}
\tag{4.32}
$$


where $\psi_m$ refers to the angle of the plane of polarization in the direction ($x$ or $y$)
indicated by the first subscript of the $R$ term in the same equation. Then for $\psi_m$
equal to 0° and 45°, the $R_{xy}$ response is proportional to $(U_v + jV_v)$ and $(-Q_v + jV_v)$.
If $V_v$ is assumed to be zero, this suffices to measure the polarized component. If both
antennas provide outputs for cross-polarized signals, the outputs of which go to two
separate receiving channels at each antenna, four correlators can be used for each
antenna pair. These provide responses for both crossed and parallel pairs, as listed in
Table 4.1. Thus, if the planes of polarization can be periodically rotated through 45°
as indicated by position angles I and II in Table 4.1, for example, by rotating antenna
feeds, then $Q_v$, $U_v$, and $V_v$ can be measured without taking differences between
responses involving $I_v$. The use of rotating feeds has, however, proved to be of
limited practicality. Rotating the feed relative to the main reflector is likely to have
a small but significant effect on the beam shape and polarization properties. This is
because the rotation will cause deviations from circular symmetry in the radiation
pattern of the feeds to interact differently with the shadowing effects of the focal
support structure and any departures from circular symmetry in the main reflector.
Furthermore, in radio astronomy systems designed for the greatest sensitivity, the
feed together with the low-noise amplifiers and a cryogenically refrigerated Dewar
are often built as one monolithic unit that cannot easily be rotated. However, for
antennas on altazimuth mounts, the variation of the parallactic angle with hour
angle causes the antenna response pattern to rotate on the sky as a source is
tracked in hour angle. Conway and Kronberg (1969) pointed out this advantage
of altazimuth mounts, which enables instrumental effects to be distinguished from
the true polarization of the source if observations continue for a period of several
hours.

Table 4.1 Stokes visibilities vs. position angles

                      Position angles    Stokes visibilities
                      m        n         measured
Position angle I      0°       0°        $I_v + Q_v$
                      0°       90°       $U_v + jV_v$
                      90°      0°        $U_v - jV_v$
                      90°      90°       $I_v - Q_v$
Position angle II     45°      45°       $I_v + U_v$
                      45°      135°      $-Q_v + jV_v$
                      135°     45°       $-Q_v - jV_v$
                      135°     135°      $I_v - U_v$

An example of a different arrangement of linearly polarized feeds, which has 
been used at the Westerbork Synthesis Radio Telescope, is described by Weiler 
(1973). The antennas are equatorially mounted and the parallactic angle of the 
polarization remains fixed as a source is tracked. The outputs of the antennas that 
are movable on rail track are correlated with those from the antennas in fixed 
locations. Table 4.2 shows the measurements when the position angles of the planes 
of polarization for the movable antennas are 45° and 135° and those of the fixed 
antennas 0° and 90°. Although the responses are reduced by a factor of $\sqrt{2}$ relative
to those in Table 4.1, there is no loss in sensitivity since each Stokes visibility 
appears at all four correlator outputs. Note that since only signals from antennas 
with different polarization configurations are cross-correlated, this scheme does not 
make use of all possible polarization products. 
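
To illustrate that all four Stokes visibilities can be recovered from the four outputs of this arrangement, the following Python sketch forms the responses listed in Table 4.2 and then inverts them by simple sums and differences. The function names and the dictionary keys (pairs of position angles in degrees) are hypothetical and chosen only for this example; gain factors are omitted.

```python
import math

SQRT2 = math.sqrt(2.0)

def table_4_2_outputs(i_v, q_v, u_v, v_v):
    """Responses of Table 4.2: fixed feeds at 0 and 90 deg, movable at 45 and 135 deg."""
    return {
        (0, 45): (i_v + q_v + u_v + 1j * v_v) / SQRT2,
        (0, 135): (-i_v - q_v + u_v + 1j * v_v) / SQRT2,
        (90, 45): (i_v - q_v + u_v - 1j * v_v) / SQRT2,
        (90, 135): (i_v - q_v - u_v + 1j * v_v) / SQRT2,
    }

def stokes_from_table_4_2(r):
    """Recover (I_v, Q_v, U_v, V_v) from the four outputs by sums and differences."""
    a, b, c, d = r[(0, 45)], r[(0, 135)], r[(90, 45)], r[(90, 135)]
    i_plus_q = (a - b) / SQRT2      # I_v + Q_v
    u_plus_jv = (a + b) / SQRT2     # U_v + jV_v
    i_minus_q = (c + d) / SQRT2     # I_v - Q_v
    u_minus_jv = (c - d) / SQRT2    # U_v - jV_v
    return (
        (i_plus_q + i_minus_q) / 2,
        (i_plus_q - i_minus_q) / 2,
        (u_plus_jv + u_minus_jv) / 2,
        (u_plus_jv - u_minus_jv) / 2j,
    )

# Round trip with illustrative (made-up) complex Stokes visibilities.
TRUE_STOKES = (1.0 + 0.2j, 0.05 - 0.01j, -0.03 + 0.02j, 0.004j)
RECOVERED = stokes_from_table_4_2(table_4_2_outputs(*TRUE_STOKES))
```

Each Stokes visibility is obtained from all four outputs, which is the reason that the factor of $\sqrt{2}$ in the individual responses does not imply a loss in sensitivity.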

Table 4.2 Stokes visibilities vs. position angles

Position angles
m        n         Stokes visibilities measured
0°       45°       $(I_v + Q_v + U_v + jV_v)/\sqrt{2}$
0°       135°      $(-I_v - Q_v + U_v + jV_v)/\sqrt{2}$
90°      45°       $(I_v - Q_v + U_v - jV_v)/\sqrt{2}$
90°      135°      $(I_v - Q_v - U_v + jV_v)/\sqrt{2}$

Opposite circularly polarized feeds offer certain advantages for measurements of
linear polarization. In determining the responses, an arbitrary position angle $\psi_m$ for
antenna $m$ is included to represent the effect of rotation caused, for example, by an
altazimuth antenna mount. If the antennas provide simultaneous outputs for opposite
senses of rotation (denoted by $r$ and $\ell$) and four correlation products are generated
for each antenna pair, the outputs are proportional to the quantities in Table 4.3.

Here, we have made $\psi_\ell = \psi_r + \pi/2$, and $\chi = -\pi/4$ for right circular
polarization and $\chi = \pi/4$ for left circular. The feeds need not be rotated during
an observation, and the responses to $Q_v$ and $U_v$ are separated from those to $I_v$. The
expressions in Table 4.3 can be simplified by choosing values of $\psi_m$, such as $\pi/2$,
$\pi/4$, or 0. For example, if $\psi_m = 0$, the sum of the $r\ell$ and $\ell r$ responses is a measure
of Stokes visibility $U_v$. Again, the effects of the rotation of the position angle with
altazimuth mounts must be taken into account. Conway and Kronberg (1969) appear
to have been the first to use an interferometer with circularly polarized antennas
to measure linear polarization in weakly polarized sources. Circularly polarized
antennas have since been commonly used in radio astronomy.


Table 4.3 Stokes visibilities vs. sense of rotation

Sense of rotation
m        n         Stokes visibilities measured
r        r         $I_v + V_v$
r        ℓ         $(-jQ_v + U_v)\,e^{-j2\psi_m}$
ℓ        r         $(-jQ_v - U_v)\,e^{j2\psi_m}$
ℓ        ℓ         $I_v - V_v$


4.7.3 Instrumental Polarization


The responses with the various combinations of linearly and circularly polarized
antennas discussed above are derived on the assumption that the polarization is
exactly linear or circular and that the position angles of the linear feeds are exactly
determined. This is not the case in practice, and the polarization ellipse can never
be maintained as a perfect circle or straight line. The nonideal characteristics of the
antennas cause an unpolarized source to appear polarized and are therefore referred
to as instrumental polarization. The effect of these deviations from ideal behavior
can be calculated from Eq. (4.29) if the deviations are known. In the expressions in
Tables 4.1–4.3, the responses given are only the major terms, and if the instrumental
terms are included, all four Stokes visibilities are, in general, involved. For example,
consider the case of crossed linear feeds with nominal position angles 0° and
90°. Let the actual values of $\psi$ and $\chi$ be such that $(\psi_x + \psi_y) = \pi/2 + \Delta\psi^{+}$,
$(\psi_x - \psi_y) = -\pi/2 + \Delta\psi^{-}$, $\chi_x + \chi_y = \Delta\chi^{+}$, and $\chi_x - \chi_y = \Delta\chi^{-}$. Then from
Eq. (4.29),

$$ R_{xy} \simeq I_v (\Delta\psi^{-} - j\Delta\chi^{+}) - Q_v (\Delta\psi^{+} - j\Delta\chi^{-}) + U_v + jV_v \,. \tag{4.33} $$

Generally, antennas can be adjusted so that the $\Delta$ terms are no more than ~1°,
and here we have assumed that they are small enough that their cosines can be 
approximated by unity, their sines by the angles, and products of two sines by 
zero. Instrumental polarization is often different for the antennas even if they are 
structurally similar, and corrections must be made to the visibility data before they 
are combined into an image. 

Although we have derived expressions for deviations of the antenna polarizations 
from the ideal in terms of the ellipticity and orientation of the polarization ellipse in 
Eq. (4.29), it is not necessary to know these parameters for the antennas so long as 
it is possible to remove the instrumental effects from the measurements, so that they 
do not appear in the final image. In calibrating the antenna responses, an approach 
that is widely preferred is to specify the instrumental polarization in terms of the 
response of the antenna to a wave of polarization that is orthogonal or opposite- 
handed with respect to the nominal antenna response. Thus, for linearly polarized 
antennas, following the analysis of Sault et al. (1991), we can write 


$$ v'_x = v_x + D_x v_y \quad\text{and}\quad v'_y = v_y + D_y v_x \,, \tag{4.34} $$

where subscripts $x$ and $y$ indicate two orthogonal planes of polarization, the $v'$ terms
indicate the signal received, the $v$ terms indicate the signal that would be received
with an ideally polarized antenna, and the $D$ terms indicate the response of the real
antenna to the polarization orthogonal to the nominal polarization. The $D$ terms
are often described as the leakage of the orthogonal polarization into the antenna
(Bignell 1982) and represent the instrumental polarization. For each polarization
state, the leakage is specified by one complex number, that is, the same number of
terms as the two real numbers required to specify the ellipticity and orientation of
the polarization ellipse. In Appendix 4.2, expressions for $D_x$ and $D_y$ are derived in
terms of the parameters of the polarization ellipse:

$$ D_x \simeq \psi_x - j\chi_x \quad\text{and}\quad D_y \simeq -\psi_y + j\chi_y \,, \tag{4.35} $$

where the approximations are valid for small values of the $\chi$ and $\psi$ parameters.
Note that in Eq. (4.35), $\psi_y$ is measured with respect to the $y$ direction. For an ideal
linearly polarized antenna, $\chi_x$ and $\chi_y$ are both zero, and the polarization in the $x$ and
$y$ planes is precisely aligned with, and orthogonal to, the $x$ direction with respect
to the antenna. Thus, for an ideal antenna, $\psi_x$ and $\psi_y$ are also zero. For a practical
antenna, the terms in Eq. (4.35) represent limits of accuracy in the hardware, and
we see that the real and imaginary parts of the leakage terms can be related to the
misalignment and ellipticity, respectively.

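For a pair of antennas $m$ and $n$, the leakage terms allow us to express the measured
correlator outputs $R'_{xx}$, $R'_{xy}$, $R'_{yx}$, and $R'_{yy}$ in terms of the unprimed quantities that
represent the corresponding correlations as they would be measured with ideally
polarized antennas:

$$
\begin{aligned}
R'_{xx}/(g_{xm} g^{*}_{xn}) &= R_{xx} + D_{xm} R_{yx} + D^{*}_{xn} R_{xy} + D_{xm} D^{*}_{xn} R_{yy} \\
R'_{xy}/(g_{xm} g^{*}_{yn}) &= R_{xy} + D_{xm} R_{yy} + D^{*}_{yn} R_{xx} + D_{xm} D^{*}_{yn} R_{yx} \\
R'_{yx}/(g_{ym} g^{*}_{xn}) &= R_{yx} + D_{ym} R_{xx} + D^{*}_{xn} R_{yy} + D_{ym} D^{*}_{xn} R_{xy} \\
R'_{yy}/(g_{ym} g^{*}_{yn}) &= R_{yy} + D_{ym} R_{xy} + D^{*}_{yn} R_{yx} + D_{ym} D^{*}_{yn} R_{xx} \,.
\end{aligned}
\tag{4.36}
$$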


The $g$ terms represent the voltage gains of the corresponding signal channels. They
are complex quantities representing amplitude and phase, and the equations can
be normalized so that the values of the individual $g$ terms do not differ greatly
from unity. Note that Eqs. (4.36) contain no small-term approximations. However,
the leakage terms are typically no more than a few percent, and products of two such
terms will be omitted at this point. Then, from Eqs. (4.31) and (4.32), the responses
can be written in terms of the Stokes visibilities as follows:

$$
\begin{aligned}
R'_{xx}/(g_{xm} g^{*}_{xn}) ={} & I_v + Q_v [\cos 2\psi_m - (D_{xm} + D^{*}_{xn}) \sin 2\psi_m] \\
& + U_v [\sin 2\psi_m + (D_{xm} + D^{*}_{xn}) \cos 2\psi_m] - jV_v (D_{xm} - D^{*}_{xn}) \\
R'_{xy}/(g_{xm} g^{*}_{yn}) ={} & I_v (D_{xm} + D^{*}_{yn}) - Q_v [\sin 2\psi_m + (D_{xm} - D^{*}_{yn}) \cos 2\psi_m] \\
& + U_v [\cos 2\psi_m - (D_{xm} - D^{*}_{yn}) \sin 2\psi_m] + jV_v \\
R'_{yx}/(g_{ym} g^{*}_{xn}) ={} & I_v (D_{ym} + D^{*}_{xn}) - Q_v [\sin 2\psi_m - (D_{ym} - D^{*}_{xn}) \cos 2\psi_m] \\
& + U_v [\cos 2\psi_m + (D_{ym} - D^{*}_{xn}) \sin 2\psi_m] - jV_v \\
R'_{yy}/(g_{ym} g^{*}_{yn}) ={} & I_v - Q_v [\cos 2\psi_m + (D_{ym} + D^{*}_{yn}) \sin 2\psi_m] \\
& - U_v [\sin 2\psi_m - (D_{ym} + D^{*}_{yn}) \cos 2\psi_m] + jV_v (D_{ym} - D^{*}_{yn}) \,.
\end{aligned}
\tag{4.37}
$$


Note that $\psi_m$ refers to the polarization ($x$ or $y$) indicated by the first of the two
subscripts of the $R'$ term in the same equation. Sault et al. (1991) describe Eqs. (4.37)
as representing the strongly polarized case. In deriving them, no restriction was
placed on the magnitudes of the Stokes visibility terms, but the leakage terms
of the antennas are assumed to be small. In the case where the source is only
weakly polarized, the products of $Q_v$, $U_v$, and $V_v$ with leakage terms can be omitted.
Equations (4.37) then become

$$
\begin{aligned}
R'_{xx}/(g_{xm} g^{*}_{xn}) &= I_v + Q_v \cos 2\psi_m + U_v \sin 2\psi_m \\
R'_{xy}/(g_{xm} g^{*}_{yn}) &= I_v (D_{xm} + D^{*}_{yn}) - Q_v \sin 2\psi_m + U_v \cos 2\psi_m + jV_v \\
R'_{yx}/(g_{ym} g^{*}_{xn}) &= I_v (D_{ym} + D^{*}_{xn}) - Q_v \sin 2\psi_m + U_v \cos 2\psi_m - jV_v \\
R'_{yy}/(g_{ym} g^{*}_{yn}) &= I_v - Q_v \cos 2\psi_m - U_v \sin 2\psi_m \,.
\end{aligned}
\tag{4.38}
$$


If the antennas are operating well within the upper frequency limit of their 
performance, the polarization terms can be expected to remain largely constant with 
time since gravitational deflections that vary with pointing should be small. The 
instrumental gain terms can contain components due to the atmosphere, which may 
vary on time scales of seconds or minutes, and they also include any effects of the 
receiver electronics. 
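
The size of the terms dropped in passing from Eqs. (4.36) to Eqs. (4.38) can be examined numerically. The sketch below, in Python, applies Eqs. (4.36) without approximation to the ideal responses of Eqs. (4.31) and (4.32) and compares the $xy$ output with the weakly polarized form. All function names are hypothetical, the gains are set to unity, and a single angle `psi_m` is used for all four ideal responses, which is an assumption of this example.

```python
import math

def ideal_linear(i_v, q_v, u_v, v_v, psi_m):
    """Ideal crossed-linear responses from Eqs. (4.31) and (4.32), gains omitted."""
    c, s = math.cos(2 * psi_m), math.sin(2 * psi_m)
    return {
        "xx": i_v + q_v * c + u_v * s,
        "xy": -q_v * s + u_v * c + 1j * v_v,
        "yx": -q_v * s + u_v * c - 1j * v_v,
        "yy": i_v - q_v * c - u_v * s,
    }

def with_leakage(r, d_xm, d_ym, d_xn, d_yn):
    """Eqs. (4.36) with unit gains, keeping products of two leakage terms."""
    cxn, cyn = complex(d_xn).conjugate(), complex(d_yn).conjugate()
    return {
        "xx": r["xx"] + d_xm * r["yx"] + cxn * r["xy"] + d_xm * cxn * r["yy"],
        "xy": r["xy"] + d_xm * r["yy"] + cyn * r["xx"] + d_xm * cyn * r["yx"],
        "yx": r["yx"] + d_ym * r["xx"] + cxn * r["yy"] + d_ym * cxn * r["xy"],
        "yy": r["yy"] + d_ym * r["xy"] + cyn * r["yx"] + d_ym * cyn * r["xx"],
    }

def weak_xy(i_v, q_v, u_v, v_v, psi_m, d_xm, d_yn):
    """Second line of Eqs. (4.38) with unit gains."""
    leak = d_xm + complex(d_yn).conjugate()
    return i_v * leak - q_v * math.sin(2 * psi_m) + u_v * math.cos(2 * psi_m) + 1j * v_v

# Illustrative (made-up) values: a few-percent polarization and leakage.
STOKES, PSI = (1.0, 0.03, 0.02, 0.0), 0.4
D_XM, D_YM, D_XN, D_YN = 0.02 + 0.01j, -0.01 + 0.02j, 0.015j, -0.015 + 0.005j
EXACT = with_leakage(ideal_linear(*STOKES, PSI), D_XM, D_YM, D_XN, D_YN)
ERROR_XY = abs(EXACT["xy"] - weak_xy(*STOKES, PSI, D_XM, D_YN))
```

For an unpolarized source the two forms of the $xy$ response agree exactly, and for the weakly polarized example the difference is of the order of the product of the leakage and the fractional polarization.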

In the case of circularly polarized antennas, leakage terms can also be defined 
and similar expressions for the instrumental response derived. The leakage terms 
are given by the following equations: 


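$$ v'_r = v_r + D_r v_\ell \quad\text{and}\quad v'_\ell = v_\ell + D_\ell v_r \,, \tag{4.39} $$

where, as before, the $v'$ terms are the measured signal voltages, the unprimed $v$
terms are the signals that would be observed with an ideally polarized antenna, and
the $D$ terms are the leakages. The subscripts $r$ and $\ell$ indicate the right and left senses
of rotation. Again, the relationship between the leakage terms and the orientation
and ellipticity of the antenna responses is derived in Appendix 4.2. The results,
which in this case require no small-angle approximations, are

$$ D_r = e^{j2\psi_r} \tan \Delta\chi_r \quad\text{and}\quad D_\ell = e^{-j2\psi_\ell} \tan \Delta\chi_\ell \,, \tag{4.40} $$

where the $\Delta$ terms are defined by $\chi_r = -45^\circ + \Delta\chi_r$ and $\chi_\ell = 45^\circ + \Delta\chi_\ell$.
To derive expressions for the outputs of an interferometer in terms of the leakage
terms and Stokes visibilities, the four measured correlator outputs are represented by
$R'_{rr}$, $R'_{r\ell}$, $R'_{\ell r}$, and $R'_{\ell\ell}$. These are related to the corresponding (unprimed) quantities
that would be observed with ideally polarized antennas as follows:

$$
\begin{aligned}
R'_{rr}/(g_{rm} g^{*}_{rn}) &= R_{rr} + D_{rm} R_{\ell r} + D^{*}_{rn} R_{r\ell} + D_{rm} D^{*}_{rn} R_{\ell\ell} \\
R'_{r\ell}/(g_{rm} g^{*}_{\ell n}) &= R_{r\ell} + D_{rm} R_{\ell\ell} + D^{*}_{\ell n} R_{rr} + D_{rm} D^{*}_{\ell n} R_{\ell r} \\
R'_{\ell r}/(g_{\ell m} g^{*}_{rn}) &= R_{\ell r} + D_{\ell m} R_{rr} + D^{*}_{rn} R_{\ell\ell} + D_{\ell m} D^{*}_{rn} R_{r\ell} \\
R'_{\ell\ell}/(g_{\ell m} g^{*}_{\ell n}) &= R_{\ell\ell} + D_{\ell m} R_{r\ell} + D^{*}_{\ell n} R_{\ell r} + D_{\ell m} D^{*}_{\ell n} R_{rr} \,.
\end{aligned}
\tag{4.41}
$$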
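Now, from the expressions in Table 4.3, the outputs in terms of the Stokes visibilities
are

$$
\begin{aligned}
R'_{rr}/(g_{rm} g^{*}_{rn}) ={} & I_v (1 + D_{rm} D^{*}_{rn}) - jQ_v (D_{rm} e^{j2\psi_m} + D^{*}_{rn} e^{-j2\psi_m}) \\
& - U_v (D_{rm} e^{j2\psi_m} - D^{*}_{rn} e^{-j2\psi_m}) + V_v (1 - D_{rm} D^{*}_{rn}) \\
R'_{r\ell}/(g_{rm} g^{*}_{\ell n}) ={} & I_v (D_{rm} + D^{*}_{\ell n}) - jQ_v (e^{-j2\psi_m} + D_{rm} D^{*}_{\ell n} e^{j2\psi_m}) \\
& + U_v (e^{-j2\psi_m} - D_{rm} D^{*}_{\ell n} e^{j2\psi_m}) - V_v (D_{rm} - D^{*}_{\ell n}) \\
R'_{\ell r}/(g_{\ell m} g^{*}_{rn}) ={} & I_v (D_{\ell m} + D^{*}_{rn}) - jQ_v (e^{j2\psi_m} + D_{\ell m} D^{*}_{rn} e^{-j2\psi_m}) \\
& - U_v (e^{j2\psi_m} - D_{\ell m} D^{*}_{rn} e^{-j2\psi_m}) + V_v (D_{\ell m} - D^{*}_{rn}) \\
R'_{\ell\ell}/(g_{\ell m} g^{*}_{\ell n}) ={} & I_v (1 + D_{\ell m} D^{*}_{\ell n}) - jQ_v (D_{\ell m} e^{-j2\psi_m} + D^{*}_{\ell n} e^{j2\psi_m}) \\
& + U_v (D_{\ell m} e^{-j2\psi_m} - D^{*}_{\ell n} e^{j2\psi_m}) - V_v (1 - D_{\ell m} D^{*}_{\ell n}) \,.
\end{aligned}
\tag{4.42}
$$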


Here again, $\psi_m$ refers to the polarization ($r$ or $\ell$) indicated by the first of the
two subscripts of the $R'$ term in the same equation. The angle $\psi_m$ represents the
parallactic angle plus any instrumental offset. We have made no approximations in
deriving Eqs. (4.42) [in the similar Eqs. (4.37), products of two $D$ terms were omitted].
If the leakage terms are small, then any product of two of them can be omitted,
as in the strongly polarized case for linearly polarized antennas in Eqs. (4.37). The
weakly polarized case is derived from the strongly polarized case by further omitting
products of $Q_v$, $U_v$, and $V_v$ with the leakage terms and is as follows:

$$
\begin{aligned}
R'_{rr}/(g_{rm} g^{*}_{rn}) &= I_v + V_v \\
R'_{r\ell}/(g_{rm} g^{*}_{\ell n}) &= I_v (D_{rm} + D^{*}_{\ell n}) - (jQ_v - U_v)\, e^{-j2\psi_m} \\
R'_{\ell r}/(g_{\ell m} g^{*}_{rn}) &= I_v (D_{\ell m} + D^{*}_{rn}) - (jQ_v + U_v)\, e^{j2\psi_m} \\
R'_{\ell\ell}/(g_{\ell m} g^{*}_{\ell n}) &= I_v - V_v \,.
\end{aligned}
\tag{4.43}
$$


Similar expressions³ are given by Fomalont and Perley (1989). To make use of the
expressions that have been derived for the response in terms of the leakage and
gain factors, we need to consider how such quantities can be calibrated, and this is
discussed later.


³ In comparing expressions for polarimetry by different authors, note that differences of signs or of
the factor $j$ can result from differences in the way the parallactic angle is defined with respect to
the antenna, and similar arbitrary factors.


4.7.4 Matrix Formulation


The description of polarimetry given above, using the ellipticity and orientation 
of the antenna response, is based on a physical model of the antenna and the 
electromagnetic wave, as in Eq. (4.29). Historically, studies of optical polarization 
have developed over a much longer period. A description of radio polarimetry 
following an approach originally developed in optics is given in Hamaker et al. 
(1996) and in more detail in four papers: Hamaker et al. (1996), Sault et al. (1996), 
Hamaker (2000), and Hamaker (2006). The mathematical analysis is largely in terms 
of matrix algebra, and in particular, it allows the responses of different elements of 
the signal path such as the atmosphere, the antennas, and the electronic system to be 
represented independently and then combined in the final solution. This approach 
is convenient for detailed analysis including effects of the atmosphere, ionosphere, 
etc. 

In the matrix formulation, the electric fields of the polarized wave are represented
by a two-component column vector. The effect of any linear system on the wave, or
on the voltage waveforms of the signal after reception, can be represented by a $2 \times 2$
matrix of the form shown below:

$$
\begin{bmatrix} E'_p \\ E'_q \end{bmatrix} =
\begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix}
\begin{bmatrix} E_p \\ E_q \end{bmatrix} ,
\tag{4.44}
$$

where $E_p$ and $E_q$ represent the input polarization state (orthogonal linear or opposite
circular) and $E'_p$ and $E'_q$ represent the outputs. The $2 \times 2$ matrix in Eq. (4.44) is
referred to as a Jones matrix (Jones 1941), and any simple linear operation on the
wave can be represented by such a matrix. Jones matrices can represent a rotation of
the wave relative to the antenna; the response of the antenna, including polarization
leakage effects; or the amplification of the signals in the receiving system up to
the correlator input. The combined effect of these operations is represented by the
product of the corresponding Jones matrices, just as the effect on a scalar voltage can
be represented by the product of gains and response factors for different stages of
the receiving system. For a wave specified in terms of opposite circularly polarized
components, Jones matrices for these operations can take the following forms:

$$ \mathbf{J}_{\rm rotation} = \begin{bmatrix} \exp(j\theta) & 0 \\ 0 & \exp(-j\theta) \end{bmatrix} \tag{4.45} $$

$$ \mathbf{J}_{\rm leakage} = \begin{bmatrix} 1 & D_r \\ D_\ell & 1 \end{bmatrix} \tag{4.46} $$

$$ \mathbf{J}_{\rm gain} = \begin{bmatrix} G_r & 0 \\ 0 & G_\ell \end{bmatrix} . \tag{4.47} $$
Here, $\theta$ represents a rotation relative to the antenna, and the cross polarization in the
antenna is represented by the off-diagonal⁴ leakage terms $D_r$ and $D_\ell$. For a nonideal
antenna, the diagonal terms will be slightly different from unity, but in this case, the
difference is subsumed into the gain matrix of the two channels. The gain of both
the antenna and the electronics can be represented by a single matrix, and since any
cross coupling of the signals in the amplifiers can be made negligibly small, only
the diagonal terms are significant in the gain matrix.
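
The following Python sketch illustrates how the three Jones matrices of Eqs. (4.45)–(4.47) can be combined by matrix multiplication. The function names are hypothetical, and the ordering used here (gain times leakage times rotation, so that the rotation acts on the wave first) is an assumption of this example, chosen to agree with the gain-times-leakage ordering used later in this section.

```python
import cmath

def matmul2(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jones_rotation(theta):
    """Eq. (4.45): rotation by theta for opposite circular components."""
    return [[cmath.exp(1j * theta), 0], [0, cmath.exp(-1j * theta)]]

def jones_leakage(d_r, d_l):
    """Eq. (4.46): off-diagonal leakage terms."""
    return [[1, d_r], [d_l, 1]]

def jones_gain(g_r, g_l):
    """Eq. (4.47): diagonal channel gains."""
    return [[g_r, 0], [0, g_l]]

def antenna_jones(g_r, g_l, d_r, d_l, theta):
    """Assumed ordering: gain * leakage * rotation (rightmost acts first)."""
    return matmul2(jones_gain(g_r, g_l),
                   matmul2(jones_leakage(d_r, d_l), jones_rotation(theta)))

# Illustrative (made-up) values; with theta = 0 only gain and leakage remain.
J_EXAMPLE = antenna_jones(1.1, 0.9, 0.02 + 0.01j, -0.01j, 0.0)
```

With $\theta = 0$, the result reduces to a matrix with the gains on the diagonal and the gain-weighted leakage terms off the diagonal.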

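Let $\mathbf{J}_m$ represent the product of the Jones matrices required to represent the linear
operations on the signal of antenna $m$ up to the point where it reaches the correlator
input. Let $\mathbf{J}_n$ be the same matrix for antenna $n$. The signals at the inputs to the
correlator are $\mathbf{J}_m \mathbf{E}_m$ and $\mathbf{J}_n \mathbf{E}_n$, where $\mathbf{E}_m$ and $\mathbf{E}_n$ are the vectors representing the
signals at the antenna. The correlator output is the outer product (also known as the
Kronecker, or tensor, product) of the signals at the input:

$$ \mathbf{E}'_m \otimes \mathbf{E}'^{*}_n = (\mathbf{J}_m \mathbf{E}_m) \otimes (\mathbf{J}^{*}_n \mathbf{E}^{*}_n) \,, \tag{4.48} $$

where $\otimes$ represents the outer product. The outer product $\mathbf{A} \otimes \mathbf{B}$ is formed by
replacing each element $a_{ik}$ of $\mathbf{A}$ by $a_{ik}\mathbf{B}$. Thus, the outer product of two $n \times n$ matrices
is a matrix of order $n^2 \times n^2$. It is also a property of the outer product that

$$ (\mathbf{A}_i \mathbf{B}_i) \otimes (\mathbf{A}_k \mathbf{B}_k) = (\mathbf{A}_i \otimes \mathbf{A}_k)(\mathbf{B}_i \otimes \mathbf{B}_k) \,. \tag{4.49} $$

Thus, we can write Eq. (4.48) as

$$ \mathbf{E}'_m \otimes \mathbf{E}'^{*}_n = (\mathbf{J}_m \otimes \mathbf{J}^{*}_n)(\mathbf{E}_m \otimes \mathbf{E}^{*}_n) \,. \tag{4.50} $$

The time average of Eq. (4.50) represents the correlator output, which is

$$
\mathbf{R}'_{mn} = \langle \mathbf{E}'_m \otimes \mathbf{E}'^{*}_n \rangle =
\begin{bmatrix} R'^{pp}_{mn} \\ R'^{pq}_{mn} \\ R'^{qp}_{mn} \\ R'^{qq}_{mn} \end{bmatrix} ,
\tag{4.51}
$$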


where $p$ and $q$ indicate opposite polarization states. The column vector in Eq. (4.51)
is known as the coherency vector and represents the four cross products from
the correlator outputs for antennas $m$ and $n$. From Eq. (4.50), it is evident that
the measured coherency vector $\mathbf{R}'_{mn}$, which includes the effects of instrumental
responses, and the true coherency vector $\mathbf{R}_{mn}$, which is free from such effects, are
related by the outer product of the Jones matrices that represent the instrumental
effects:

$$ \mathbf{R}'_{mn} = (\mathbf{J}_m \otimes \mathbf{J}^{*}_n)\, \mathbf{R}_{mn} \,. \tag{4.52} $$
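
The outer product and the property in Eq. (4.49) can be checked numerically. The Python sketch below implements the outer product exactly as described in the text, with each element $a_{ik}$ of $\mathbf{A}$ replaced by $a_{ik}\mathbf{B}$, and compares the two sides of Eq. (4.50) before time averaging. The helper names and the numerical values are hypothetical, and matrices are represented as nested lists.

```python
def kron(a, b):
    """Outer (Kronecker) product: each element a_ik of A is replaced by a_ik * B."""
    return [[a_ik * b_jl for a_ik in row_a for b_jl in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def conj(a):
    """Element-by-element complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in a]

def mixed_product_gap(j_m, j_n, e_m, e_n):
    """Largest difference between (J_m E_m) (x) (J_n E_n)* and (J_m (x) J_n*)(E_m (x) E_n*)."""
    lhs = kron(matmul(j_m, e_m), conj(matmul(j_n, e_n)))
    rhs = matmul(kron(j_m, conj(j_n)), kron(e_m, conj(e_n)))
    return max(abs(l[0] - r[0]) for l, r in zip(lhs, rhs))

# Illustrative (made-up) Jones matrices and signal column vectors.
J_M = [[1.0, 0.02 + 0.01j], [-0.01j, 0.98]]
J_N = [[1.01, 0.015], [0.005 - 0.02j, 1.0]]
E_M = [[1.0 + 0.5j], [0.2 - 0.3j]]
E_N = [[0.8 - 0.1j], [0.4 + 0.6j]]
GAP = mixed_product_gap(J_M, J_N, E_M, E_N)
```

The value of `GAP` should be at the level of floating-point rounding error, and the outer product of the two $2 \times 2$ Jones matrices has order $4 \times 4$, as stated above.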


⁴ The diagonal terms are those that move downward from left to right, and the off-diagonal terms
slope in the opposite direction.
To determine the response of an interferometer in terms of the Stokes visibilities of
the input radiation, which are complex quantities, we introduce the Stokes visibility
vector

$$ \mathbf{V}_{Smn} = \begin{bmatrix} I_v \\ Q_v \\ U_v \\ V_v \end{bmatrix} . \tag{4.53} $$

The Stokes visibilities can be regarded as an alternate coordinate system for the
coherency vector. Let $\mathbf{S}$ be a $4 \times 4$ transformation matrix from Stokes parameters to
the polarization coordinates of the antennas. Then we have

$$ \mathbf{R}'_{mn} = (\mathbf{J}_m \otimes \mathbf{J}^{*}_n)\, \mathbf{S}\, \mathbf{V}_{Smn} \,. \tag{4.54} $$

For ideal antennas with crossed (orthogonal) linear polarization, the response in
terms of Stokes visibilities is given by the expressions in Table 4.1. We can write
this result in matrix form as

$$
\begin{bmatrix} R_{xx} \\ R_{xy} \\ R_{yx} \\ R_{yy} \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & j \\ 0 & 0 & 1 & -j \\ 1 & -1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} I_v \\ Q_v \\ U_v \\ V_v \end{bmatrix} ,
\tag{4.55}
$$

where the subscripts $x$ and $y$ here refer to polarization position angles 0° and 90°,
respectively. Similarly for opposite-hand circular polarization, we can write the
expressions in Table 4.3 as

$$
\begin{bmatrix} R_{rr} \\ R_{r\ell} \\ R_{\ell r} \\ R_{\ell\ell} \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & -je^{-j2\psi_m} & e^{-j2\psi_m} & 0 \\
0 & -je^{j2\psi_m} & -e^{j2\psi_m} & 0 \\
1 & 0 & 0 & -1
\end{bmatrix}
\begin{bmatrix} I_v \\ Q_v \\ U_v \\ V_v \end{bmatrix} .
\tag{4.56}
$$


The $4 \times 4$ matrices in Eqs. (4.55) and (4.56) are transformation matrices from
Stokes visibilities to the coherency vector for crossed linear and opposite circular
polarizations, respectively. These $4 \times 4$ matrices are known as Mueller matrices⁵
following the terminology established in optics. Note that these matrices depend
on the particular formulation we have used to specify the angles $\psi$ and $\chi$, and other
factors in Fig. 4.8, which may not be identical to corresponding parameters used by
other authors.
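
As an illustration of how the matrices in Eqs. (4.55) and (4.56) are used, the following Python sketch converts a Stokes visibility vector to the coherency vector and back again for both feed types. The function names are hypothetical, gain factors are omitted, and for the circular case the same angle `psi_m` is used in both cross-hand rows, exactly as Eq. (4.56) is written.

```python
import cmath

def linear_matrix():
    """4 x 4 matrix of Eq. (4.55): linear feeds at position angles 0 and 90 deg."""
    return [[1, 1, 0, 0], [0, 0, 1, 1j], [0, 0, 1, -1j], [1, -1, 0, 0]]

def circular_matrix(psi_m):
    """4 x 4 matrix of Eq. (4.56): opposite circular feeds."""
    e = cmath.exp(2j * psi_m)
    return [[1, 0, 0, 1], [0, -1j / e, 1 / e, 0], [0, -1j * e, -e, 0], [1, 0, 0, -1]]

def apply(matrix, vec):
    """Matrix times column vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def stokes_from_linear(r_xx, r_xy, r_yx, r_yy):
    """Invert Eq. (4.55) by sums and differences."""
    return [(r_xx + r_yy) / 2, (r_xx - r_yy) / 2,
            (r_xy + r_yx) / 2, (r_xy - r_yx) / 2j]

def stokes_from_circular(r_rr, r_rl, r_lr, r_ll, psi_m):
    """Invert Eq. (4.56): remove the rotation, then take sums and differences."""
    e = cmath.exp(2j * psi_m)
    a, b = r_rl * e, r_lr / e          # a = -jQ + U, b = -jQ - U
    return [(r_rr + r_ll) / 2, 0.5j * (a + b), (a - b) / 2, (r_rr - r_ll) / 2]

# Round trips with illustrative (made-up) Stokes visibilities.
STOKES = [1.0, 0.05 - 0.01j, -0.03 + 0.02j, 0.004]
BACK_LINEAR = stokes_from_linear(*apply(linear_matrix(), STOKES))
BACK_CIRCULAR = stokes_from_circular(*apply(circular_matrix(0.7), STOKES), 0.7)
```

In the circular case, $Q_v$ and $U_v$ are obtained from the two cross-hand outputs alone, without differencing quantities that contain $I_v$.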


⁵ Further explanation of Jones and Mueller matrices can be found in textbooks on optics [e.g.,
O’Neill (1963)].
The expression $\mathbf{S}^{-1} (\mathbf{J}_m \otimes \mathbf{J}^{*}_n)\, \mathbf{S}$ is a matrix that relates the input and output
coherency vectors of a system where these quantities are in Stokes coordinate form.
As an example of the matrix usage, we can derive the effect of the leakage and gain
factors in the case of opposite circular polarizations. For antenna $m$, the Jones matrix
$\mathbf{J}_m$ is the product of the Jones matrices for leakage and gain as follows:

$$
\mathbf{J}_m =
\begin{bmatrix} g_{rm} & 0 \\ 0 & g_{\ell m} \end{bmatrix}
\begin{bmatrix} 1 & D_{rm} \\ D_{\ell m} & 1 \end{bmatrix} =
\begin{bmatrix} g_{rm} & g_{rm} D_{rm} \\ g_{\ell m} D_{\ell m} & g_{\ell m} \end{bmatrix} .
\tag{4.57}
$$

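Here, the $g$ terms represent voltage gain, the $D$ terms represent leakage, and the
subscripts $r$ and $\ell$ indicate polarization. A corresponding matrix $\mathbf{J}_n$ is required for
antenna $n$. Then if we use primes to indicate the components of the coherency vector
(i.e., the correlator outputs) for antennas $m$ and $n$, we can write

$$
\begin{bmatrix} R'_{rr} \\ R'_{r\ell} \\ R'_{\ell r} \\ R'_{\ell\ell} \end{bmatrix} =
\mathbf{J}_m \otimes \mathbf{J}^{*}_n
\begin{bmatrix}
1 & 0 & 0 & 1 \\
0 & -je^{-j2\psi_m} & e^{-j2\psi_m} & 0 \\
0 & -je^{j2\psi_m} & -e^{j2\psi_m} & 0 \\
1 & 0 & 0 & -1
\end{bmatrix}
\begin{bmatrix} I_v \\ Q_v \\ U_v \\ V_v \end{bmatrix} ,
\tag{4.58}
$$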


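where the $4 \times 4$ matrix is the one relating Stokes visibilities to the coherency vector
in Eq. (4.56). Also, we have

$$
\mathbf{J}_m \otimes \mathbf{J}^{*}_n =
\begin{bmatrix}
g_{rm} g^{*}_{rn} & g_{rm} g^{*}_{rn} D^{*}_{rn} & g_{rm} g^{*}_{rn} D_{rm} & g_{rm} g^{*}_{rn} D_{rm} D^{*}_{rn} \\
g_{rm} g^{*}_{\ell n} D^{*}_{\ell n} & g_{rm} g^{*}_{\ell n} & g_{rm} g^{*}_{\ell n} D_{rm} D^{*}_{\ell n} & g_{rm} g^{*}_{\ell n} D_{rm} \\
g_{\ell m} g^{*}_{rn} D_{\ell m} & g_{\ell m} g^{*}_{rn} D_{\ell m} D^{*}_{rn} & g_{\ell m} g^{*}_{rn} & g_{\ell m} g^{*}_{rn} D^{*}_{rn} \\
g_{\ell m} g^{*}_{\ell n} D_{\ell m} D^{*}_{\ell n} & g_{\ell m} g^{*}_{\ell n} D_{\ell m} & g_{\ell m} g^{*}_{\ell n} D^{*}_{\ell n} & g_{\ell m} g^{*}_{\ell n}
\end{bmatrix} .
\tag{4.59}
$$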

Insertion of Eq. (4.59) into Eq. (4.58) and reduction of the matrix products results 
in Eq. (4.42) for the response with circularly polarized feeds. The use of matrices is 
convenient since they provide a format for expressions representing different effects, 
which can then be combined as required. 
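
The reduction described above can be verified numerically. The Python sketch below builds $\mathbf{J}_m$ and $\mathbf{J}_n$ from Eq. (4.57), forms their outer product, applies it to the ideal coherency vector of Eq. (4.56), and compares the $rr$ output with the first line of Eqs. (4.42). The function names and numerical values are hypothetical and serve only as an illustration.

```python
import cmath

def antenna_jones(g_r, g_l, d_r, d_l):
    """Right-hand side of Eq. (4.57): gain matrix times leakage matrix."""
    return [[g_r, g_r * d_r], [g_l * d_l, g_l]]

def kron(a, b):
    """Outer product: each element a_ik of A is replaced by a_ik * B."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def measured_outputs(j_m, j_n, stokes, psi_m):
    """Eq. (4.58): (J_m (x) J_n*) applied to the ideal circular coherency vector."""
    i_v, q_v, u_v, v_v = stokes
    e = cmath.exp(2j * psi_m)
    ideal = [i_v + v_v, (-1j * q_v + u_v) / e, (-1j * q_v - u_v) * e, i_v - v_v]
    j_n_conj = [[complex(x).conjugate() for x in row] for row in j_n]
    return [sum(c * r for c, r in zip(row, ideal)) for row in kron(j_m, j_n_conj)]

def rr_from_eq_4_42(g_rm, g_rn, d_rm, d_rn, stokes, psi_m):
    """First line of Eqs. (4.42), multiplied through by g_rm g_rn*."""
    i_v, q_v, u_v, v_v = stokes
    e = cmath.exp(2j * psi_m)
    c = complex(d_rn).conjugate()
    body = (i_v * (1 + d_rm * c) - 1j * q_v * (d_rm * e + c / e)
            - u_v * (d_rm * e - c / e) + v_v * (1 - d_rm * c))
    return g_rm * complex(g_rn).conjugate() * body

# Illustrative (made-up) gains, leakages, Stokes visibilities, and angle.
STOKES, PSI = (1.0, 0.04, -0.02, 0.005), 0.6
J_M = antenna_jones(1.02, 0.97, 0.03 + 0.01j, -0.02 + 0.015j)
J_N = antenna_jones(0.99 + 0.02j, 1.01, 0.01 - 0.02j, 0.025j)
R_PRIME = measured_outputs(J_M, J_N, STOKES, PSI)
R_RR = rr_from_eq_4_42(1.02, 0.99 + 0.02j, 0.03 + 0.01j, 0.01 - 0.02j, STOKES, PSI)
```

The first element of `R_PRIME` and the value `R_RR` should agree to within rounding error, since no small-term approximations are involved in either expression.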


4.7.5 Calibration of Instrumental Polarization 


The fractional polarization of many astronomical sources is of magnitude
comparable to that of the leakage and gain terms that are used above to define the
instrumental polarization. Thus, to obtain an accurate measure of the polarization
of a source, the leakage and gain terms must be accurately calibrated. It may be
necessary to determine the calibration independently for each set of observations
since the gain terms may be functions of the temperature and state of adjustment
of the electronics and cannot be assumed to remain constant from one observing
session to another. Making observations (i.e., measuring the coherency vector)
of sources for which the polarization parameters are already known is clearly
a way of determining the leakage and gain terms. The number of unknown
parameters to be calibrated is proportional to the number of antennas, $n_a$, but the
number of measurements is proportional to the number of baselines, $n_a(n_a - 1)/2$.
The unknown parameters are therefore usually overdetermined, and a least-mean-
squares solution may be the best procedure.
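
The scaling argument can be made concrete with a short Python sketch that counts unknowns and measured quantities for an array of $n_a$ antennas. The function name is hypothetical. The default of seven unknowns per antenna is taken from the discussion that follows, and the count of four complex correlator outputs per baseline (eight real numbers) for a single calibrator observation is an assumption of this example. The comparison is only a rough guide, since the number of independent constraints also depends on the polarization of the calibrators.

```python
def calibration_budget(n_a, unknowns_per_antenna=7, products_per_baseline=4):
    """Return (baselines, unknowns, real measurements) for one calibrator scan."""
    baselines = n_a * (n_a - 1) // 2
    unknowns = unknowns_per_antenna * n_a
    # Each complex correlator output supplies two real numbers.
    real_measurements = 2 * products_per_baseline * baselines
    return baselines, unknowns, real_measurements

# For a hypothetical ten-antenna array.
BUDGET_10 = calibration_budget(10)
```

For ten antennas this gives 45 baselines, 70 unknowns, and 360 real measured quantities, and the ratio of measurements to unknowns grows in proportion to $n_a - 1$.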

For any antenna with orthogonally polarized receiving channels, there are seven 
degrees of freedom, that is, seven unknown quantities, that must be calibrated to 
allow full interpretation of the measured Stokes visibilities. This applies to the 
general case, and the number can be reduced if approximations are made for 
weak polarization or small instrumental polarization. In terms of the polarization 
ellipses, these unknowns can be regarded as the orientations and ellipticities of 
the two orthogonal feeds and the complex gains (amplitudes and phases) of the 
two receiving channels. When the outputs of two antennas are combined, only 
the differences in the instrumental phases are required, leaving seven degrees of 
freedom per antenna. Sault et al. (1996) make the same point from the consideration 
of the Jones matrix of an antenna, which contains four complex quantities. They 
also give a general result that illustrates the seven degrees of freedom or unknown 
terms. This expresses the relationship between the uncorrected (measured) Stokes 
visibilities (indicated by primes) and the true values of the Stokes visibilities, in 
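terms of seven $\gamma$ and $\delta$ terms:

$$
\begin{bmatrix} I'_v - I_v \\ Q'_v - Q_v \\ U'_v - U_v \\ V'_v - V_v \end{bmatrix} =
\frac{1}{2}
\begin{bmatrix}
\gamma_{++} & \gamma_{+-} & \delta_{+-} & -j\delta_{-+} \\
\gamma_{+-} & \gamma_{++} & \delta_{++} & -j\delta_{--} \\
\delta_{+-} & -\delta_{++} & \gamma_{++} & j\gamma_{--} \\
-j\delta_{-+} & j\delta_{--} & -j\gamma_{--} & \gamma_{++}
\end{bmatrix}
\begin{bmatrix} I_v \\ Q_v \\ U_v \\ V_v \end{bmatrix} .
\tag{4.60}
$$

The seven $\gamma$ and $\delta$ terms are defined as follows:

$$
\begin{aligned}
\gamma_{++} &= (\Delta g_{xm} + \Delta g_{ym}) + (\Delta g^{*}_{xn} + \Delta g^{*}_{yn}) \\
\gamma_{+-} &= (\Delta g_{xm} - \Delta g_{ym}) + (\Delta g^{*}_{xn} - \Delta g^{*}_{yn}) \\
\gamma_{--} &= (\Delta g_{xm} - \Delta g_{ym}) - (\Delta g^{*}_{xn} - \Delta g^{*}_{yn}) \\
\delta_{++} &= (D_{xm} + D_{ym}) + (D^{*}_{xn} + D^{*}_{yn}) \\
\delta_{+-} &= (D_{xm} - D_{ym}) + (D^{*}_{xn} - D^{*}_{yn}) \\
\delta_{-+} &= (D_{xm} + D_{ym}) - (D^{*}_{xn} + D^{*}_{yn}) \\
\delta_{--} &= (D_{xm} - D_{ym}) - (D^{*}_{xn} - D^{*}_{yn}) \,.
\end{aligned}
\tag{4.61}
$$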


Here, it is assumed that Eqs. (4.36) are normalized so that the gain terms are close
to unity, and the $\Delta g$ terms are defined by $g_{ik} = 1 + \Delta g_{ik}$. The $D$ (leakage) terms
and the $\Delta g$ terms are often small enough that products of two such terms can be
neglected. The results, as shown in Eqs. (4.60) and (4.61), apply to antennas that
are linearly polarized in directions $x$ and $y$. The same results apply to circularly
polarized antennas if the subscripts $x$ and $y$ are replaced by $r$ and $\ell$, respectively, and,
in the column matrices on the left and right sides of Eq. (4.60), terms in $Q_v$, $U_v$, and
$V_v$ are replaced by corresponding terms in $V_v$, $Q_v$, and $U_v$, respectively. A similar
result is given by Sault et al. (1991). The seven $\gamma$ and $\delta$ terms defined above are
subject to errors in the calibration process, so there are seven degrees of freedom in
the error mechanisms.
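
A minimal Python sketch of Eqs. (4.60) and (4.61) is given below. It forms the seven $\gamma$ and $\delta$ terms from the gain errors and leakage terms of one antenna pair and returns the resulting errors in the four Stokes visibilities. The function names and the dictionary keys are hypothetical, and the sign conventions are simply those of the two equations as written.

```python
def gamma_delta(dg_xm, dg_ym, dg_xn, dg_yn, d_xm, d_ym, d_xn, d_yn):
    """The seven gamma and delta terms of Eq. (4.61)."""
    c = lambda z: complex(z).conjugate()
    gamma = {
        "++": (dg_xm + dg_ym) + (c(dg_xn) + c(dg_yn)),
        "+-": (dg_xm - dg_ym) + (c(dg_xn) - c(dg_yn)),
        "--": (dg_xm - dg_ym) - (c(dg_xn) - c(dg_yn)),
    }
    delta = {
        "++": (d_xm + d_ym) + (c(d_xn) + c(d_yn)),
        "+-": (d_xm - d_ym) + (c(d_xn) - c(d_yn)),
        "-+": (d_xm + d_ym) - (c(d_xn) + c(d_yn)),
        "--": (d_xm - d_ym) - (c(d_xn) - c(d_yn)),
    }
    return gamma, delta

def stokes_error(stokes, gamma, delta):
    """Eq. (4.60): measured minus true Stokes visibilities."""
    g, d = gamma, delta
    matrix = [
        [g["++"], g["+-"], d["+-"], -1j * d["-+"]],
        [g["+-"], g["++"], d["++"], -1j * d["--"]],
        [d["+-"], -d["++"], g["++"], 1j * g["--"]],
        [-1j * d["-+"], 1j * d["--"], -1j * g["--"], g["++"]],
    ]
    return [0.5 * sum(a * s for a, s in zip(row, stokes)) for row in matrix]

# Illustrative case: a common 1% gain error in all four channels, no leakage.
GAMMA, DELTA = gamma_delta(0.01, 0.01, 0.01, 0.01, 0.0, 0.0, 0.0, 0.0)
ERRORS = stokes_error([1.0, 0.1, 0.05, 0.0], GAMMA, DELTA)
```

In this case only $\gamma_{++}$ is nonzero, and each Stokes visibility is in error by 2%, which is the first-order change in the product of two gains that are each 1% high.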

An observation of a single calibration source for which the four Stokes parame- 
ters are known enables four of the degrees of freedom to be determined. However, 
because of the relationships of the quantities involved, it takes at least three 
calibration observations to solve for all seven unknown parameters (Sault et al. 
1996). In the calibration observations, it is useful to observe one unpolarized 
source, but observing a second unpolarized one would add no further solutions. 
At least one observation of a linearly polarized source is required to determine 
the relative phases of the two oppositely polarized channels, that is, the relative 
phases of the complex gain terms $g_{xm} g^{*}_{yn}$ and $g_{ym} g^{*}_{xn}$, or $g_{rm} g^{*}_{\ell n}$ and $g_{\ell m} g^{*}_{rn}$. Note
that with antennas on altazimuth mounts, observations of a calibrator with linear 
polarization, taken at intervals between which large rotations of the parallactic angle 
occur, can essentially be regarded as observations of independent calibrators. Under 
these circumstances, three observations of the same calibrator will suffice for the 
full solution. Furthermore, the polarization of the calibrator need not be known in 
advance but can be determined from the observations. 
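
As a rough illustration of why widely spaced scans of one calibrator can behave like independent calibrators, the following sketch evaluates the standard spherical-astronomy expression for the parallactic angle, tan ψ_p = cos ℒ sin H / (sin ℒ cos δ − cos ℒ sin δ cos H). The site latitude, the calibrator declination, and the function name are hypothetical choices made only for this example.

```python
import math

def parallactic_angle(hour_angle, declination, latitude):
    """Parallactic angle in radians; all arguments are in radians."""
    numerator = math.cos(latitude) * math.sin(hour_angle)
    denominator = (math.sin(latitude) * math.cos(declination)
                   - math.cos(latitude) * math.sin(declination) * math.cos(hour_angle))
    return math.atan2(numerator, denominator)

# Hypothetical site latitude and calibrator declination (assumptions for illustration).
LATITUDE = math.radians(34.0)
DECLINATION = math.radians(30.0)

# Three scans of the same calibrator at hour angles of -3 h, 0 h, and +3 h.
hour_angles_h = (-3.0, 0.0, 3.0)
angles_deg = [
    math.degrees(parallactic_angle(math.radians(15.0 * h), DECLINATION, LATITUDE))
    for h in hour_angles_h
]
swing_deg = angles_deg[-1] - angles_deg[0]

for h, angle in zip(hour_angles_h, angles_deg):
    print(f"H = {h:+.1f} h  parallactic angle = {angle:+.1f} deg")
print(f"total swing = {swing_deg:.1f} deg")
```

For a source that transits near the zenith, as in this example, the swing over a few hours exceeds 90°, so the apparent position angle of the calibrator's linear polarization differs strongly from scan to scan.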

In cases in which only an unpolarized calibrator can be observed, it may be 
possible to estimate two more degrees of freedom by introducing the constraint 
that the sum of the leakage factors over all antennas should be small. As shown 
by the expressions for the leakage terms in Appendix 4.2, this is a reasonable 
assumption for a homogeneous array, that is, one in which the antennas are of 
nominally identical design. However, the phase difference between the signal paths 
from the feeds to the correlator for the two orthogonal polarizations of each antenna 
remains unknown. This requires an observation of a calibrator with a component 
of linear polarization, or a scheme to measure the instrumental component of the 
phase. For example, on the compact array of the Australia Telescope (Frater and 
Brooks 1992), noise sources are provided at each antenna to inject a common signal 
into the two polarization channels (Sault et al. 1996). With such a system, it is 
necessary to provide an additional correlator for each antenna, or to be able to 
rearrange correlator inputs, to measure the relative phase of the injected signals in 
the two polarizations. 

In the case of the approximations for weak polarization, Eqs. (4.38) and (4.43) 
show that if the gain terms are known, the leakage terms can be calibrated by 
observing an unpolarized source. For opposite circular polarizations, Eq. (4.43)
shows that if $V_v$ is small, it is possible to obtain solutions for the gain terms from
the outputs for the $\ell\ell$ and $rr$ combinations only, provided also that the number of
baselines is several times larger than the number of antennas. The leakage terms
can then be solved for separately. For crossed linear polarizations, Eq. (4.38) shows
that this is possible only if the linear polarization (the $Q_v$ and $U_v$ parameters) of the
calibrator has been determined independently.

Optimum strategies for calibration of polarization observations are a subject
that leads to highly detailed discussions involving the characteristics of particular 
synthesis arrays, the hour angle range of the observations, the availability of 
calibration sources (which can depend on the observing frequency), and other 
factors, especially if the solutions for strong polarization are used. Such discussions 
can be found, for example, in Conway and Kronberg (1969), Weiler (1973), Bignell 
(1982), Sault et al. (1991), Sault et al. (1996), and Smegal et al. (1997). Polarization 
measurements with VLBI involve some special considerations: see, for example, 
Roberts et al. (1991), Cotton (1993), Roberts et al. (1994), and Kemball et al. (1995). 

For most large synthesis arrays, effective calibration techniques have been 
devised and the software to implement them has been developed. Thus, a prospective 
observer need not be discouraged if the necessary calibration procedures appear 
complicated. Some general considerations relevant to observations of polarization 
are given below. 


• Since the polarization of many sources varies on a timescale of months, it is
usually advisable to regard the polarization of the calibration source as one of the 
variables to be solved for. 

• Two sources with relatively strong linear polarization at position angles that do
not appear to vary are 3C286 and 3C138. These are useful for checking the phase 
difference for oppositely polarized channels. 

• For most sources, the circular polarization parameter $V_v$ is very small, ~0.2%
or less, and can be neglected. Measurements with circularly polarized antennas
of the same sense therefore generally give an accurate measure of $I_v$. However,
circular polarization is important in the measurement of magnetic fields by
Zeeman splitting. As an example of positive detection at a very low level, Fiebig
and Güsten (1989) describe measurements for which $V/I \simeq 5 \times 10^{-5}$. Zeeman
splitting of several components of the H₂O line at 22.235 GHz was observed using
a single antenna, the 100-m paraboloid of the Max Planck Institute for Radio
Astronomy, with a receiving system that switched between opposite circular
polarizations at 10 Hz. Rotation of the feed and receiver unit was used to identify
spurious instrumental responses to linearly polarized radiation, and calibration of
the relative pointing of the two beams to 1″ accuracy was required.

• Although the polarized emission from most sources is small compared with the
total emission, it is possible for Stokes visibilities $Q_v$ and $U_v$ to be comparable
to $I_v$ in cases in which there is a broad unpolarized component that is highly
resolved and a narrower polarized component that is not resolved. In such
cases, errors may occur if the approximations for weak polarization [Eqs. (4.38)
and (4.43)] are used in the data analysis.

• For most antennas, the instrumental polarization varies over the main beam and
increases toward the beam edges. Sidelobes that are cross polarized relative to the
main beam tend to peak near the beam edges. Thus, polarization measurements
are usually made for cases in which the source is small compared with the width
of the main beam, and for such measurements, the beam should be centered on
the source.

• Faraday rotation of the plane of polarization of incoming radiation occurs in
the ionosphere and becomes important for frequencies below a few gigahertz; 
see Table 14.1. During polarization measurements, periodic observations of a 
strongly polarized source are useful for monitoring changes in the rotation, 
which varies with the total column density of electrons in the ionosphere. If not 
accounted for, Faraday rotation can cause errors in calibration; see, for example, 
Sakurai and Spangler (1994). 

• In some antennas, the feed is displaced from the axis of the main reflector, for
example, when the Cassegrain focus is used and the feeds for different bands are
located in a circle around the vertex. For circularly polarized feeds, this departure
from circular symmetry results in pointing offsets of the beams for the two
opposite hands. The pointing directions of the two beams are typically separated
by ~0.1 beamwidths, which makes measurements of circular polarization
difficult because $V_v$ is proportional to $(R_{rr} - R_{\ell\ell})$. For linearly polarized feeds,
the corresponding effect is an increase in the cross-polarized sidelobes near the
beam edges.

• In VLBI, the large distances between antennas result in different parallactic
angles at different sites, which must be taken into account. 

• The quantities $m_\ell$ and $m_t$, of Eqs. (4.20) and (4.22), have Rice distributions of
the form of Eq. (6.63a), and the position angle has a distribution of the form of
Eq. (6.63b). The percentage polarization can be overestimated, and a correction
should be applied (Wardle and Kronberg 1974).
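
The overestimate mentioned in the last point arises because the polarized intensity is formed as the positive quantity sqrt(Q² + U²), so noise in Q and U never averages to zero. The sketch below demonstrates the bias with a small Monte Carlo experiment and applies a simple quadrature correction of the kind associated with Wardle and Kronberg (1974). The exact form of the estimator, the function names, and the noise level are assumptions of this example, not a statement of the procedure used by any particular package.

```python
import math
import random

def debias_polarized_intensity(p_observed, sigma):
    """Approximate Rice-bias correction: sqrt(p_obs^2 - sigma^2), floored at zero."""
    if p_observed <= sigma:
        return 0.0
    return math.sqrt(p_observed ** 2 - sigma ** 2)

def mean_observed_polarization(p_true, sigma, n_samples, seed=1):
    """Monte Carlo mean of sqrt(Q^2 + U^2) when Q and U carry Gaussian noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        q = p_true + rng.gauss(0.0, sigma)  # all polarized flux placed in Q for simplicity
        u = rng.gauss(0.0, sigma)
        total += math.hypot(q, u)
    return total / n_samples

SIGMA = 1.0  # hypothetical rms noise in Q and U, arbitrary units

# An unpolarized source still yields a positive mean polarized intensity.
mean_unpolarized = mean_observed_polarization(0.0, SIGMA, 20000)

# Correcting a single 3-sigma measurement.
corrected = debias_polarized_intensity(3.0, SIGMA)

print(f"mean measured P for an unpolarized source: {mean_unpolarized:.3f} sigma")
print(f"debiased value of a 3-sigma measurement:   {corrected:.3f} sigma")
```

For an unpolarized source, the mean measured value approaches σ·sqrt(π/2), the mean of a Rayleigh distribution, which is the zero-signal limit of the Rice distribution.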


The following points concern choices in designing an array for polarization 
measurements. 


• The rotation of an antenna on an altazimuth mount, relative to the sky, can
sometimes be used to advantage in polarimetry. However, the rotation could be 
a disadvantage in cases in which polarization imaging over a large part of the 
antenna beam is being attempted. Correction for the variation of instrumental 
polarization over the beam may be more complicated if the beam rotates on the 
sky. 

• With linearly polarized antennas, errors in calibration are likely to cause $I_v$
to corrupt the linear parameters $Q_v$ and $U_v$, so for measurements of linear
polarization, circularly polarized antennas offer an advantage. Similarly, with
circularly polarized antennas, calibration errors are likely to cause $I_v$ to corrupt
$V_v$, so for measurements of circular polarization, linearly polarized antennas may
be preferred.

• Linearly polarized feeds for reflector antennas can be made with relative
bandwidths of at least 2:1, whereas for circularly polarized feeds, the maximum
relative bandwidth is commonly about 1.4:1. In many designs of circularly
polarized feeds, orthogonal linear components of the field are combined with
±90° relative phase shifts, and the phase-shifting element limits the bandwidth.
For this reason, linear polarization is sometimes the choice for synthesis arrays
[see, e.g., James (1992)], and with careful calibration, good polarization performance
is obtainable.

• The stability of the instrumental polarization, which greatly facilitates accurate
calibration over a wide range of hour angle, is perhaps the most important feature 
to be desired. Caution should therefore be used if feeds are rotated relative to the 
main reflector or if antennas are used near the high end of their frequency range. 


4.8 The Interferometer Measurement Equation 


The set of equations for the visibility values that would be measured for a 
given brightness distribution—taking account of all details of the locations and 
characteristics of the individual antennas, the path of the incoming radiation through 
the Earth’s atmosphere including the ionosphere, the atmospheric transmission, 
etc.—is commonly referred to as the measurement equation or the interferometer 
measurement equation. For any specified brightness distribution and any system 
of antennas, the measurement equation provides accurate values of the visibility 
that would be observed. The reverse operation, i.e., the calculation of the optimum 
estimate of the brightness distribution from the measured visibility values, is more 
complicated. Taking the Fourier transform of the observed visibility function usually 
produces a brightness function with physically distorted features such as negative 
brightness values in some places. However, starting with a physically realistic 
model for the brightness, the measurement equation can accurately provide the 
corresponding visibility values that would be observed. This provides a basis for 
derivation of realistic brightness distributions that represent the observed visibilities, 
using an iterative procedure. 

The formulation of the interferometer measurement equation is based on the analysis
of Hamaker et al. (1996), as further developed by Rau et al. (2009), Smirnov
(2011a,b,c,d), and others. It traces the variations of the signals from a source to the
output of the receiving system. Direction-dependent effects include the direction of
propagation of the signals, the primary beams of the antennas, polarization effects
that vary with the alignment of the polarization of the source relative to that of
the antennas, and also the effects of the ionosphere and troposphere. Direction-independent
effects include the gains of the signal paths from the outputs of the
antennas to the correlator. It is necessary to take account of all these various effects 
to calculate accurately the visibility values corresponding to the source model. 
Several of these effects are dependent upon the types of the interferometer antennas 
and the observing frequencies, so the details of the measurement equation are to 
some extent specific to each particular instrument to which it is applied. 

The variations in the signal characteristics can generally be expressed as
the effects of Faraday rotation, parallactic rotation, tilting of the wavefront by
propagation effects, and variations in feed responses. These are linear effects on
the signal and, as noted in Sect. 4.7.4, each of them can be represented by a 2 × 2
(Jones) matrix. Their effect on the signal matrix is given by a series of outer products
as explained with respect to Eq. (4.48). If the original signal is represented by the
vector $\mathbf{I}$ and the series of effects along the signal path by Jones matrices $\mathbf{J}_{p1}$ to $\mathbf{J}_{pn}$ for
antenna $p$ and $\mathbf{J}_{q1}$ to $\mathbf{J}_{qm}$ for antenna $q$, then the voltage at the correlator output from
the pair of antennas $p$ and $q$ is represented by

$$
\mathbf{V}_{pq} = \mathbf{J}_{pn}\bigl(\cdots\bigl(\mathbf{J}_{p2}\bigl(\mathbf{J}_{p1}\,\mathbf{I}\,\mathbf{J}^{H}_{q1}\bigr)\mathbf{J}^{H}_{q2}\bigr)\cdots\bigr)\mathbf{J}^{H}_{qm} , \tag{4.62}
$$

where the superscript $H$ indicates the Hermitian conjugate (conjugate transpose). Each of the $\mathbf{J}$
terms represents a 2 × 2 (Jones) matrix. This analysis is from Smirnov (2011a,b,c,d).
The combination of the various corrections into a single equation is helpful in 
ensuring that no significant effects have been overlooked. 

An alternative formulation takes each product $\mathbf{J}_{pn} \otimes \mathbf{J}^{H}_{qn}$, which results in a 4 × 4
(Mueller) matrix for each of the effects to be corrected along the signal path. If the
resulting matrices are represented by $[\mathbf{J}_{pn} \otimes \mathbf{J}^{H}_{qn}]$, where $n$ indicates the physical order
in which the effects are encountered in the propagation path, then the correction for
the effects is obtained as a series of products:


where $\mathbf{S}$ is a Fourier transform matrix that converts the Stokes visibility to
brightness. Each of the $[\mathbf{J}_{pn} \otimes \mathbf{J}^{H}_{qn}]$ terms represents a 4 × 4 matrix. This is basically
the form used by Rau et al. (2009). The details of the interferometer equation will
vary for different instruments, depending upon which factors need to be included.
Here, the intention is to give a general outline of how the calibration factors can be
applied. Further details can be found in papers by Hamaker et al. (1996), Hamaker
(2000), Rau et al. (2009), and Smirnov (2011a,b,c,d).
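
The relation between the nested 2 × 2 products and the 4 × 4 form can be checked numerically. The sketch below applies a chain of Jones matrices to a 2 × 2 source matrix and compares the result with the equivalent product of 4 × 4 matrices acting on the flattened source. Several things here are assumptions of the example rather than the conventions of the papers cited above: the source is written as a 2 × 2 coherency matrix built from Stokes parameters, the flattening is row-major (for which the 4 × 4 factor is the Kronecker product of one Jones matrix with the complex conjugate of the other), both antennas have the same number of effects, and all matrix values and function names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def hermitian(a):
    """Conjugate transpose of a matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def conjugate(a):
    """Element-wise complex conjugate of a matrix."""
    return [[complex(x).conjugate() for x in row] for row in a]

def kron(a, b):
    """Kronecker product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def apply_jones_chain(jones_p, jones_q, source):
    """Nested 2 x 2 products; the lists are ordered from sky to correlator."""
    v = source
    for j_p, j_q in zip(jones_p, jones_q):
        v = matmul(matmul(j_p, v), hermitian(j_q))
    return v

def mueller_chain(jones_p, jones_q):
    """Equivalent 4 x 4 matrix acting on the row-major flattening of the source."""
    total = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    for j_p, j_q in zip(jones_p, jones_q):
        total = matmul(kron(j_p, conjugate(j_q)), total)
    return total

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

def leakage(d_x, d_y):
    return [[1.0, d_x], [d_y, 1.0]]

def gain(g_x, g_y):
    return [[g_x, 0.0], [0.0, g_y]]

# Hypothetical Stokes parameters and instrumental terms.
I, Q, U, V = 1.0, 0.10, 0.05, 0.0
source = [[I + Q, U + 1j * V], [U - 1j * V, I - Q]]
jones_p = [rotation(0.30), leakage(0.02 + 0.01j, -0.015j), gain(1.05 * cmath.exp(0.1j), 0.97)]
jones_q = [rotation(0.35), leakage(-0.01j, 0.02), gain(0.98, 1.02 * cmath.exp(-0.2j))]

v_nested = apply_jones_chain(jones_p, jones_q, source)
flat_source = [[x] for row in source for x in row]
v_mueller = matmul(mueller_chain(jones_p, jones_q), flat_source)

flat_nested = [x for row in v_nested for x in row]
max_difference = max(abs(a - b[0]) for a, b in zip(flat_nested, v_mueller))
print(f"largest difference between the two forms: {max_difference:.2e}")
```

The two forms agree to rounding error. The 2 × 2 form keeps each antenna's terms separate, whereas the 4 × 4 form collects every effect into one matrix per baseline, which is convenient when the corrections are applied to a vector of four correlation products.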


4.8.1 Multibaseline Formulation 


In this chapter thus far, we have mainly considered the response of a single pair 
of antennas. The data gathered from a multielement array can conveniently be 
expressed in the form of a covariance matrix. The discussion here largely follows 
Leshem et al. (2000) and Boonstra and van der Veen (2003). We start from the 
expression for the two-element interferometer response and, for simplicity, consider 
the small-angle case in which the w component can be omitted, as in Eq. (3.9), 


$$
V(u, v) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \frac{A_N(l, m)\, I(l, m)}{\sqrt{1 - l^2 - m^2}}\; e^{-j2\pi(ul + vm)}\, dl\, dm . \tag{4.64}
$$




Here, V is the complex visibility, and u and v represent the projected baseline 
coordinates measured in wavelengths in a plane normal to the phase reference 
direction. We make four adjustments to the equation. (1) We assume that both the 
astronomical brightness function and the visibility function can each be represented 
by a point-source model with a number of points $p$. For a point $k$, the direction is
specified by direction cosines $(l_k, m_k)$. We replace the integrals in Eq. (4.64) with
summations over the points. (2) We replace $A_N$ by the product of the corresponding
complex voltage gain factors $g_i(l, m)\, g^{*}_j(l, m)$, where $i$ and $j$ indicate antennas.
Constants representing conversion of aperture to gain, etc., can be ignored since,
in practice, the intensity scale is determined by calibration. (3) We allow the factor
$\sqrt{1 - l^2 - m^2}$ to be subsumed within the intensity function $I(l, m)$. (4) For each
antenna, we specify the components in the $(u, v)$ plane relative to a reference point
that can be chosen, for example, to be the center of the array. The $(u, v)$ values for
a pair of antennas $i$ and $j$ then become $(u_i - u_j, v_i - v_j)$. The second and fourth
modifications allow the parameters involved to be specified in terms of individual
antennas rather than antenna pairs. Equation (4.64) can now be written as:


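$$
V(u_i - u_j, v_i - v_j) = \sum_{k=1}^{p} g_i(l_k, m_k)\, e^{-j2\pi(u_i l_k + v_i m_k)}\, I(l_k, m_k)\, g^{*}_j(l_k, m_k)\, e^{j2\pi(u_j l_k + v_j m_k)} , \tag{4.65}
$$

where $I_k = I(l_k, m_k)$. Note that $u$ and $v$ do not vary with the source positions within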
the field of view but are defined for the phase reference position (field center). 
Equations (4.64) and (4.65) represent the visibility as measured by a single pair 
of antennas. 

It is useful to put Eq. (4.65) in matrix form. For an array of n antennas, we 
define an n x p matrix containing terms corresponding to the first antenna gain and 
exponential terms of Eq. (4.65) (i.e., the terms associated with antenna i): 


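$$
\mathbf{A} =
\begin{bmatrix}
g_1(l_1, m_1)\, e^{-j2\pi(u_1 l_1 + v_1 m_1)} & \cdots & g_1(l_p, m_p)\, e^{-j2\pi(u_1 l_p + v_1 m_p)} \\
g_2(l_1, m_1)\, e^{-j2\pi(u_2 l_1 + v_2 m_1)} & \cdots & g_2(l_p, m_p)\, e^{-j2\pi(u_2 l_p + v_2 m_p)} \\
\vdots & \ddots & \vdots \\
g_n(l_1, m_1)\, e^{-j2\pi(u_n l_1 + v_n m_1)} & \cdots & g_n(l_p, m_p)\, e^{-j2\pi(u_n l_p + v_n m_p)}
\end{bmatrix} . \tag{4.66}
$$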


The antenna index increases downward across the n rows, and the point-source index 
increases toward the right across the p columns. 

To generate the covariance matrix, we first define a $p \times p$ diagonal matrix
containing the intensity values of the $p$ source-model points:

$$
\mathbf{B} =
\begin{bmatrix}
I_1 & & & \\
 & I_2 & & \\
 & & \ddots & \\
 & & & I_p
\end{bmatrix} . \tag{4.67}
$$

Then we can write

$$
\mathbf{R} = \mathbf{A}\mathbf{B}\mathbf{A}^{H} , \tag{4.68}
$$


where the superscript $H$ indicates the Hermitian transpose (transposition of the
matrix plus complex conjugation). $\mathbf{R}$ is the covariance matrix, which has dimensions
$n \times n$ and is Hermitian: $r_{ij} = r^{*}_{ji}$. The element $r_{ij}$ in row $i$ and column $j$ is
equal to the right side of Eq. (4.65), that is, the sum of responses to the $p$ intensity
points for the antenna pair $(i, j)$, and it represents the cross-correlation of the signals
from antennas $i$ and $j$. When the gain factors $g$ are equal to unity, the elements represent
the source visibility $V$. The diagonal elements are the $n$ self-products ($i = j$), which
represent the total power responses of the antennas.
$\mathbf{R}$ contains the full set of correlator output terms for an array of $n$ antennas for a
single averaging period and a single frequency channel. These data, when calibrated
as visibility, can provide a snapshot image. In cases in which the $w$ component is
important, a term of the form $w(\sqrt{1 - l^2 - m^2} - 1)$ [as in Eq. (3.7)], with appropriate
subscripts, can be included within each exponent. If the response patterns of the
antennas are identical, i.e., $g_i = g_j$ for all $(i, j)$, then $g_i g^{*}_j = |g|^2$, and this (real) gain
factor can be taken outside the matrix $\mathbf{R}$. Thus, to determine the angle of incidence
$(l, m)$ of a signal from the covariance measurements [the $(u, v)$ values being known],
the gain factors need not be known if they are identical from one antenna to another
but otherwise must be known.
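
A short numerical check may help to make the matrix form concrete. The sketch below builds the matrix of Eq. (4.66) for a small array, forms the product of Eq. (4.68), and compares one element with the direct summation of Eq. (4.65). The antenna coordinates, the three-point sky model, the use of unit direction-independent gains, and the function names are all hypothetical choices for this example.

```python
import cmath
import math

def steering_matrix(antenna_uv, source_lm, gains):
    """n x p matrix A of Eq. (4.66), here with direction-independent gains g_i."""
    return [[g * cmath.exp(-2j * math.pi * (u * l + v * m)) for (l, m) in source_lm]
            for (u, v), g in zip(antenna_uv, gains)]

def covariance_matrix(a, intensities):
    """R = A B A^H with B = diag(intensities), as in Eq. (4.68)."""
    n = len(a)
    return [[sum(a[i][k] * i_k * a[j][k].conjugate() for k, i_k in enumerate(intensities))
             for j in range(n)] for i in range(n)]

def direct_visibility(uv_i, uv_j, source_lm, intensities):
    """Right side of Eq. (4.65) for unit gains."""
    du, dv = uv_i[0] - uv_j[0], uv_i[1] - uv_j[1]
    return sum(i_k * cmath.exp(-2j * math.pi * (du * l + dv * m))
               for (l, m), i_k in zip(source_lm, intensities))

# Hypothetical antenna (u, v) coordinates in wavelengths and point-source model.
ANTENNA_UV = [(0.0, 0.0), (120.0, 35.0), (-60.0, 210.0), (300.0, -90.0)]
SOURCE_LM = [(0.0, 0.0), (1.0e-3, -2.0e-3), (-3.0e-3, 0.5e-3)]
INTENSITIES = [1.0, 0.4, 0.25]
UNIT_GAINS = [1.0] * len(ANTENNA_UV)

A = steering_matrix(ANTENNA_UV, SOURCE_LM, UNIT_GAINS)
R = covariance_matrix(A, INTENSITIES)

v_12 = direct_visibility(ANTENNA_UV[1], ANTENNA_UV[2], SOURCE_LM, INTENSITIES)
print(f"r_12 from the matrix product: {R[1][2]:.6f}")
print(f"r_12 from direct summation:   {v_12:.6f}")
print(f"total power r_00:             {R[0][0].real:.6f}")
```

With unit gains, each off-diagonal element is the model visibility for that baseline, and each diagonal element is the sum of the point-source intensities.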

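The covariance matrix can also be formulated in terms of the complex signal
voltages from the antennas of an array. Let the signal from antenna $k$ be $x_k$, which is
a function of time. For the array, the signals can be represented by a (column) vector
$\mathbf{x}$ of dimensions $n \times 1$, each term of which corresponds to the sum of the terms in
the corresponding row of the matrix in Eq. (4.66). The outer (or Kronecker) product
$\mathbf{x} \otimes \mathbf{x}^{H}$ leads to a covariance matrix:

$$
\mathbf{R}' = \mathbf{x} \otimes \mathbf{x}^{H} =
\begin{bmatrix}
x_1 x_1^{*} & x_1 x_2^{*} & \cdots & x_1 x_n^{*} \\
x_2 x_1^{*} & x_2 x_2^{*} & \cdots & x_2 x_n^{*} \\
\vdots & \vdots & \ddots & \vdots \\
x_n x_1^{*} & x_n x_2^{*} & \cdots & x_n x_n^{*}
\end{bmatrix} . \tag{4.69}
$$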

The elements $r_{ij}$ of the matrix $\mathbf{R}$ represent the correlator outputs, which involve a
time average of the signal products. If the signal products in the elements of $\mathbf{R}'$ are
similarly understood to represent time-averaged products, then $\mathbf{R}'$ is equivalent to
the covariance matrix $\mathbf{R}$.
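
The equivalence of the time-averaged outer product and the model covariance can be illustrated with simulated noise-like signals. In the sketch below, each point source emits an independent complex Gaussian signal whose variance equals its intensity, the antenna voltages are formed from the matrix of Eq. (4.66), and the averaged outer product is compared with the product of Eq. (4.68). Receiver noise is omitted, unit gains are assumed, and the array, sky model, sample count, and names are hypothetical.

```python
import cmath
import math
import random

# Hypothetical three-antenna array and two-point sky model (unit gains assumed).
ANTENNA_UV = [(0.0, 0.0), (150.0, 40.0), (-80.0, 220.0)]
SOURCE_LM = [(0.0, 0.0), (2.0e-3, -1.0e-3)]
INTENSITIES = [1.0, 0.5]

A = [[cmath.exp(-2j * math.pi * (u * l + v * m)) for (l, m) in SOURCE_LM]
     for (u, v) in ANTENNA_UV]
N_ANT = len(A)

# Model covariance R = A B A^H.
R_MODEL = [[sum(A[i][k] * INTENSITIES[k] * A[j][k].conjugate()
                for k in range(len(INTENSITIES)))
            for j in range(N_ANT)] for i in range(N_ANT)]

def averaged_outer_product(n_samples, seed=0):
    """Time average of x x^H for noise-like source signals (receiver noise omitted)."""
    rng = random.Random(seed)
    acc = [[0j] * N_ANT for _ in range(N_ANT)]
    for _ in range(n_samples):
        # One complex Gaussian amplitude per source, with mean square equal to I_k.
        s = [math.sqrt(i_k / 2.0) * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
             for i_k in INTENSITIES]
        # Antenna voltages: each row of A weighted by the source amplitudes.
        x = [sum(a_ik * s_k for a_ik, s_k in zip(row, s)) for row in A]
        for i in range(N_ANT):
            for j in range(N_ANT):
                acc[i][j] += x[i] * x[j].conjugate()
    return [[value / n_samples for value in row] for row in acc]

R_PRIME = averaged_outer_product(5000)
MAX_ERROR = max(abs(R_PRIME[i][j] - R_MODEL[i][j])
                for i in range(N_ANT) for j in range(N_ANT))
print(f"largest |R' - R| after 5000 samples: {MAX_ERROR:.3f}")
```

The residual difference shrinks roughly as the inverse square root of the number of samples, which is the usual behavior of a correlator average.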

An example of the application of matrix formulation in radio astronomy is 
provided by the discussion of gain calibration by Boonstra and van der Veen (2003). 
Also, the eigenvectors of the matrix can be used to identify interfering signals that 
are strong enough to be distinguished in the presence of the noise. Such signals can 
then be removed from the data, as discussed, for example, by Leshem et al. (2000). 


Appendix 4.1 Hour Angle—Declination and 
Elevation—Azimuth Relationships 


Although the positions of cosmic sources are almost always specified in celestial 
coordinates, for purposes of observation, it is generally necessary to convert to 
elevation and azimuth. The conversion formulas between hour angle and declination 
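$(H, \delta)$ and elevation and azimuth $(\mathcal{E}, \mathcal{A})$ can be derived by applying the sine and
cosine rules for spherical triangles to the system in Fig. 4.3. For an observer at
latitude $\mathcal{L}$, they are, for $(H, \delta)$ to $(\mathcal{A}, \mathcal{E})$,

$$
\begin{aligned}
\sin\mathcal{E} &= \sin\mathcal{L}\sin\delta + \cos\mathcal{L}\cos\delta\cos H \\
\cos\mathcal{E}\cos\mathcal{A} &= \cos\mathcal{L}\sin\delta - \sin\mathcal{L}\cos\delta\cos H \\
\cos\mathcal{E}\sin\mathcal{A} &= -\cos\delta\sin H .
\end{aligned}
\tag{A4.1}
$$

Similarly, for $(\mathcal{A}, \mathcal{E})$ to $(H, \delta)$,

$$
\begin{aligned}
\sin\delta &= \sin\mathcal{L}\sin\mathcal{E} + \cos\mathcal{L}\cos\mathcal{E}\cos\mathcal{A} \\
\cos\delta\cos H &= \cos\mathcal{L}\sin\mathcal{E} - \sin\mathcal{L}\cos\mathcal{E}\cos\mathcal{A} \\
\cos\delta\sin H &= -\cos\mathcal{E}\sin\mathcal{A} .
\end{aligned}
\tag{A4.2}
$$

Here, azimuth is measured from north through east.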
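
The two sets of relations translate directly into code. The sketch below implements both conversions, using the two-argument arctangent so that the quadrant of the azimuth or hour angle is resolved from the sine and cosine products together. The function names and the test values are hypothetical; all angles are in radians.

```python
import math

def hadec_to_azel(hour_angle, declination, latitude):
    """(H, dec) -> (azimuth, elevation); azimuth measured from north through east."""
    sin_el = (math.sin(latitude) * math.sin(declination)
              + math.cos(latitude) * math.cos(declination) * math.cos(hour_angle))
    elevation = math.asin(max(-1.0, min(1.0, sin_el)))
    cos_el_cos_az = (math.cos(latitude) * math.sin(declination)
                     - math.sin(latitude) * math.cos(declination) * math.cos(hour_angle))
    cos_el_sin_az = -math.cos(declination) * math.sin(hour_angle)
    azimuth = math.atan2(cos_el_sin_az, cos_el_cos_az) % (2.0 * math.pi)
    return azimuth, elevation

def azel_to_hadec(azimuth, elevation, latitude):
    """(azimuth, elevation) -> (H, dec); H is returned in the range (-pi, pi]."""
    sin_dec = (math.sin(latitude) * math.sin(elevation)
               + math.cos(latitude) * math.cos(elevation) * math.cos(azimuth))
    declination = math.asin(max(-1.0, min(1.0, sin_dec)))
    cos_dec_cos_h = (math.cos(latitude) * math.sin(elevation)
                     - math.sin(latitude) * math.cos(elevation) * math.cos(azimuth))
    cos_dec_sin_h = -math.cos(elevation) * math.sin(azimuth)
    hour_angle = math.atan2(cos_dec_sin_h, cos_dec_cos_h)
    return hour_angle, declination

# Hypothetical observer latitude and source position.
LAT = math.radians(34.0)
az, el = hadec_to_azel(math.radians(30.0), math.radians(20.0), LAT)
h_back, dec_back = azel_to_hadec(az, el, LAT)

# A source on the meridian south of the zenith should appear at azimuth 180 deg.
az_transit, el_transit = hadec_to_azel(0.0, math.radians(30.0), LAT)

print(f"azimuth = {math.degrees(az):.3f} deg, elevation = {math.degrees(el):.3f} deg")
print(f"round trip: H = {math.degrees(h_back):.3f} deg, dec = {math.degrees(dec_back):.3f} deg")
```

The round trip recovers the input hour angle and declination, and a source transiting south of the zenith is returned at azimuth 180° with elevation equal to 90° minus the difference between latitude and declination.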


Appendix 4.2 Leakage Parameters in Terms of the 
Polarization Ellipse 


The polarization leakage terms used to express the instrumental polarization are 
related to the ellipticity and orientation of the polarization ellipses of each antenna, 
as shown below. 


A4.2.1 Linear Polarization


Consider the antenna in Fig. 4.8, and suppose that it is nominally linearly polarized
in the $x$ direction, in which case $\chi$ and $\psi$ are small angles that represent engineering
tolerances. A field $E$ aligned with the $x$ axis in Fig. 4.8a produces components $E'_x$
and $E'_y$ along the $(x', y')$ axes with which the dipoles in Fig. 4.8b are aligned. Then
from Eq. (4.26), we obtain the voltage at the output of the antenna (point A in
Fig. 4.8b), which is

$$
V'_x = E(\cos\psi\cos\chi + j\sin\psi\sin\chi) . \tag{A4.3}
$$

The response to the same field, but aligned with the $y$ axis, is

$$
V'_y = E(\sin\psi\cos\chi - j\cos\psi\sin\chi) . \tag{A4.4}
$$


$V'_x$ represents the wanted response to the field along the $x$ axis, and $V'_y$ represents the
unwanted response to a cross-polarized field. The leakage term is equal to the
cross-polarized response expressed as a fraction of the wanted x-polarization response,
that is,

$$
D_x = \frac{V'_y}{V'_x} = \frac{(\sin\psi_x\cos\chi_x - j\cos\psi_x\sin\chi_x)}{(\cos\psi_x\cos\chi_x + j\sin\psi_x\sin\chi_x)} \simeq \psi_x - j\chi_x , \tag{A4.5}
$$


where the subscript $x$ indicates the x-polarization case. The corresponding term
$D_y$, for the condition in which Fig. 4.8 represents the nominal $y$ polarization of the
antenna, is obtained as $V'_x / V'_y$ by inverting Eq. (A4.5), replacing $\psi_x$ by $\psi_y + \pi/2$,
and replacing $\chi_x$ by $\chi_y$. Then $\psi_y$ is measured from the $y$ axis in the same sense as
$\psi_x$ is measured from the $x$ axis, that is, increasing in a counterclockwise direction
in Fig. 4.8. Thus, we obtain

$$
\begin{aligned}
D_y = \frac{V'_x}{V'_y} &= \frac{[\cos(\psi_y + \pi/2)\cos\chi_y + j\sin(\psi_y + \pi/2)\sin\chi_y]}{[\sin(\psi_y + \pi/2)\cos\chi_y - j\cos(\psi_y + \pi/2)\sin\chi_y]} \\
&= \frac{(-\sin\psi_y\cos\chi_y + j\cos\psi_y\sin\chi_y)}{(\cos\psi_y\cos\chi_y + j\sin\psi_y\sin\chi_y)} \simeq -\psi_y + j\chi_y .
\end{aligned}
\tag{A4.6}
$$
Similar expressions for $D_x$ and $D_y$ have also been derived by Sault et al. (1991).
Note that $D_x$ and $D_y$ are of comparable magnitude and opposite sign, so one would
expect the average of all the $D$ terms for an array of antennas to be very small. As
used earlier in this chapter, subscripts $m$ and $n$ are added to the $D$ terms to indicate
individual antennas.
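
The small-angle forms of the linear leakage terms can be compared with the exact ratios numerically. The sketch below evaluates the full expressions for D_x and D_y, using the response of the nominally x-polarized antenna to fields along x and along y, namely E(cos ψ cos χ + j sin ψ sin χ) and E(sin ψ cos χ − j cos ψ sin χ). The tolerance angles and the function names are hypothetical.

```python
import math

def leakage_x(psi_x, chi_x):
    """Exact D_x: cross-polarized response divided by the wanted x response."""
    numerator = complex(math.sin(psi_x) * math.cos(chi_x), -math.cos(psi_x) * math.sin(chi_x))
    denominator = complex(math.cos(psi_x) * math.cos(chi_x), math.sin(psi_x) * math.sin(chi_x))
    return numerator / denominator

def leakage_y(psi_y, chi_y):
    """Exact D_y, obtained from the inverted ratio with psi shifted by pi/2."""
    numerator = complex(-math.sin(psi_y) * math.cos(chi_y), math.cos(psi_y) * math.sin(chi_y))
    denominator = complex(math.cos(psi_y) * math.cos(chi_y), math.sin(psi_y) * math.sin(chi_y))
    return numerator / denominator

# Hypothetical engineering tolerances: 1 deg in orientation, 0.5 deg in ellipticity.
PSI = math.radians(1.0)
CHI = math.radians(0.5)

d_x = leakage_x(PSI, CHI)
d_y = leakage_y(PSI, CHI)
d_x_small_angle = complex(PSI, -CHI)   # psi_x - j chi_x
d_y_small_angle = complex(-PSI, CHI)   # -psi_y + j chi_y

print(f"D_x exact = {d_x:.6f}, small-angle = {d_x_small_angle:.6f}")
print(f"D_y exact = {d_y:.6f}, small-angle = {d_y_small_angle:.6f}")
print(f"D_x + D_y = {d_x + d_y:.2e}")
```

When the two feeds of an antenna share the same tolerance angles, as assumed here, D_x and D_y cancel exactly. In practice the angles differ slightly, and the cancellation is only approximate.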


A4.2.2 Circular Polarization


To receive right circular polarization from the sky, the antenna in Fig. 4.8b must
respond to a field with counterclockwise rotation in the plane of the diagram,
as explained earlier. This requires $\chi = -45°$. In terms of fields in the $x$ and $y$
directions, counterclockwise rotation requires that $E_x$ leads $E_y$ in phase by $\pi/2$; that
is, $E_x = jE_y$ for the fields as defined in Eq. (4.25). For fields $E_x$ and $E_y$, we determine
the components in the $x'$ and $y'$ directions and then obtain expressions for the output
of the antenna for both counterclockwise and clockwise rotation of the incident field.
For counterclockwise rotation:


$$
E'_x = E_x\cos\psi + E_y\sin\psi = E_x(\cos\psi - j\sin\psi) , \tag{A4.7}
$$

$$
E'_y = -E_x\sin\psi + E_y\cos\psi = -E_x(\sin\psi + j\cos\psi) . \tag{A4.8}
$$

For nominal right-circular polarization, $\chi_r = -\pi/4 + \Delta\chi_r$, where $\Delta\chi_r$ is a measure
of the departure of the polarization from circularity. Then from Eq. (4.26), we obtain

$$
V'_r = E_x e^{-j\psi_r}(\cos\chi_r - \sin\chi_r) = \sqrt{2}\,E_x e^{-j\psi_r}\cos\Delta\chi_r . \tag{A4.9}
$$


The next step is to repeat the procedure for left circular polarization from the sky,
for which we have clockwise rotation of the electric vector and $E_y = jE_x$. The result
is

$$
V'_\ell = E_x e^{j\psi_r}(\cos\chi_r + \sin\chi_r) = \sqrt{2}\,E_x e^{j\psi_r}\sin\Delta\chi_r . \tag{A4.10}
$$


The relative magnitude of the opposite-hand response of the nominally right-handed 
polarization state, that is, the leakage term, is 


$$
D_r = \frac{V'_\ell}{V'_r} = e^{j2\psi_r}\tan\Delta\chi_r \simeq e^{j2\psi_r}\,\Delta\chi_r . \tag{A4.11}
$$


For nominal left-handed polarization, the relative magnitude of the opposite-hand
response is obtained by inverting the right side of Eq. (A4.11) and also substituting
$\Delta\chi_\ell + \pi/2$ for $\Delta\chi_r$ and $\psi_\ell - \pi/2$ for $\psi_r$. For the corresponding leakage term $D_\ell$,
which represents the right circular leakage of the nominally left circularly polarized
antenna, we then obtain

$$
D_\ell = e^{-j2\psi_\ell}\tan\Delta\chi_\ell \simeq e^{-j2\psi_\ell}\,\Delta\chi_\ell . \tag{A4.12}
$$

Since $-\pi/4 \le \chi \le \pi/4$, $\Delta\chi_r$ and $\Delta\chi_\ell$ take opposite signs. Thus, as in the case of
the leakage terms for linear polarization, $D_r$ and $D_\ell$ are of comparable magnitude
and opposite sign.
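
The circular leakage term can be verified in the same way by computing the antenna output for fields of the two rotation senses. In the sketch below, the output voltage is taken to be E'_x cos χ − j E'_y sin χ. This form is inferred from the linear-polarization responses given earlier in this appendix and is an assumption of the example, as are the angle values and the function name.

```python
import cmath
import math

def antenna_voltage(e_x, e_y, psi, chi):
    """Antenna output for field components (E_x, E_y); assumed form of the response."""
    e_xp = e_x * math.cos(psi) + e_y * math.sin(psi)    # component along x'
    e_yp = -e_x * math.sin(psi) + e_y * math.cos(psi)   # component along y'
    return e_xp * math.cos(chi) - 1j * e_yp * math.sin(chi)

# Hypothetical orientation and departure from circularity.
PSI_R = math.radians(20.0)
DELTA_CHI_R = math.radians(1.5)
CHI_R = -math.pi / 4.0 + DELTA_CHI_R

# Unit-amplitude E_x in both cases.
v_right = antenna_voltage(1.0, -1j, PSI_R, CHI_R)  # counterclockwise: E_x = j E_y
v_left = antenna_voltage(1.0, 1j, PSI_R, CHI_R)    # clockwise: E_y = j E_x

d_r = v_left / v_right
d_r_formula = cmath.exp(2j * PSI_R) * math.tan(DELTA_CHI_R)

print(f"D_r from the field calculation:   {d_r:.6f}")
print(f"D_r = exp(j 2 psi) tan(delta chi): {d_r_formula:.6f}")
```

The ratio of the two outputs reproduces exp(j2ψ_r) tan Δχ_r to rounding error, and the magnitude of the wanted response equals sqrt(2) cos Δχ_r for a unit-amplitude E_x.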


Chapter 5
Antennas and Arrays 


This chapter opens with a brief review of some basic considerations of antennas. 
The main part of the chapter is concerned with the configurations of antennas in 
interferometers and synthesis arrays. It is convenient to classify array designs as 
follows: 


1. Arrays with nontracking antennas 
2. Interferometers and arrays with antennas that track the sidereal motion of a 
source: 


   • Linear arrays
   • Arrays with open-ended arms (crosses, T-shaped arrays, and Y-shaped arrays)
   • Arrays with closed configurations (circles, ellipses, and Reuleaux triangles)
   • VLBI arrays
   • Planar arrays.


Examples of these types of arrays are described, and their spatial transfer functions 
(i.e., spatial sensitivities) are compared. Other concerns include the size and number 
of antennas needed in an array. Also discussed is the technique of forming images 
from direct Fourier transformation of the electric field on an aperture. 


5.1 Antennas 


The subject of antennas is well covered in numerous books; see Further Reading at 
the end of this chapter. Baars (2007) gives an informative review of parabolic antennas,
including details of testing and surface adjustment. Here, we are concerned with
the special requirements of antennas for radio astronomy. As discussed in Chap. 1,
early radio astronomy antennas operated mainly at meter wavelengths and often
consisted of arrays of dipoles or parabolic-cylinder reflectors. These had large areas,
but the operating wavelengths were long enough that beamwidths were usually of
order 1° or more. For detection and cataloging of sources, satisfactory observations 
could be obtained during the passage of a source through a stationary beam or 
interferometer fringe pattern. Thus, it was not always necessary for such antennas 
to track the sidereal motion of a source. More recent meter-wavelength systems use 
dipole arrays with computer-controlled phasing to provide tracking beams [see, e.g., 
Koles et al. (1994) and Lonsdale et al. (2009)]. For higher frequencies, synthesis 
arrays use tracking antennas that incorporate equatorial or altazimuth mounts. 

The requirement for high sensitivity and angular resolution has resulted in the 
development of large arrays of antennas. Such instruments are usually designed to 
cover a range of frequencies. For centimeter-wavelength instruments, the coverage 
typically includes bands extending from a few hundred megahertz to some tens 
of gigahertz. For such frequency ranges, the antennas are most often parabolic or 
similar-type reflectors, with separate feeds for the different frequency bands. In 
addition to wide frequency coverage, another advantage of the parabolic reflector 
is that all of the power collected is brought, essentially without loss, to a single 
focus, which allows full advantage to be taken of low-loss feeds and cryogenically 
cooled input stages to provide the maximum sensitivity. 

Figure 5.1 shows several focal arrangements for parabolic antennas, of which the 
Cassegrain is perhaps the most often used. The Cassegrain focus offers a number 
of advantages. A convex hyperbolic reflector intercepts the radiation just before it 
reaches the prime focus and directs it to the Cassegrain focus near the vertex of the 
main reflector. Sidelobes resulting from spillover of the beam of the feed around 
the edges of the subreflector point toward the sky, for which the noise temperature 
is generally low. With a prime-focus feed, the sidelobes resulting from spillover 
around the main reflector point toward the ground and thus result in a higher level 
of unwanted noise pickup. The Cassegrain focus also has the advantage that in all 
but the smallest antennas, an enclosure can be provided behind the main reflector to
accommodate the low-noise input stages of the electronics. However, the aperture
of the feed for a prime-focus location is less than that for a feed at the Cassegrain
focus, and as a result, the feeds for the longer wavelengths are often at the prime
focus.

Fig. 5.1 Focus arrangements of reflector antennas: (a) prime focus; (b) Cassegrain focus; (c)
Nasmyth focus; (d) offset Cassegrain. With the Nasmyth focus, the feed horn is mounted on
the alidade structure below the elevation axis (indicated by the dashed line), and for a linearly
polarized signal, the angle of polarization relative to the feed varies with the elevation angle. In
some other arrangements, for example, beam-waveguide antennas (not shown), there are several
reflectors, including one on the azimuth axis, which allows the feed horn to remain fixed relative
to the ground. The polarization then rotates relative to the feed for both azimuth and elevation
motions.

The Cassegrain design also allows the possibility of improving the aperture efficiency
by shaping the two reflectors of the antenna (Williams 1965). The principle
involved can best be envisioned by considering the antenna in transmission. With 
a conventional hyperboloid subreflector and parabolic main reflector, the radiation 
from the feed is concentrated toward the center of the antenna aperture, whereas 
for maximum efficiency, the electric field should be uniformly distributed. If the 
profile of the subreflector is slightly adjusted, more power can be directed toward the 
outer part of the main reflector, thus improving the uniformity. The main reflector 
must then be shaped to depart slightly from the parabolic profile to regain uniform 
phase across the wavefront after it leaves the main reflector. This type of shaping 
is used, for example, in the antennas of the VLA in New Mexico, for which the 
main reflector is 25 m in diameter. For the VLA, the rms difference between the 
reflector surfaces and the best fit paraboloid is ~ 1 cm, so the antennas can be used 
with prime-focus feeds for wavelengths longer than ~ 16 cm. Shaping is not always 
to be preferred since it introduces some restriction in off-axis performance, which 
is detrimental for multibeam applications. Multiple beams for a large parabolic 
antenna can greatly increase sky coverage, which is particularly useful for survey 
observations. A beamformer feed system in which beams are formed using phased 
arrays of feed elements is described by Elmer et al. (2012), who consider various 
designs (see discussion in Sect. 5.7.2.1). 

For tracking parabolic reflectors, there are numerous differences in the detailed 
design. For example, when a number of feeds for different frequency bands are 
required at the Cassegrain focus, these are sometimes mounted on a turntable 
structure, and the feed that is in use is brought to a position on the axis of the 
main reflector. Alternatively, the feeds may be in fixed positions on a circle centered
on the vertex, and by using a rotatable subreflector of slightly asymmetric design, 
the incoming radiation can be focused onto the required feed. 

Parabolic reflector antennas with asymmetrical feed geometry can exhibit unde- 
sirable instrumental polarization effects that would largely cancel out in a circularly 
symmetrical antenna. This may occur in an unblocked aperture design, as in 
Fig. 5.1d, or in a design in which a cluster of feeds is used for operation on a number 
of frequency bands, where the feeds are close to, but not exactly on, the axis of the 
paraboloid. With crossed linearly polarized feeds, the asymmetry results in cross- 
polarization sidelobes within the main beam. With opposite circularly polarized 
feeds, the two beams are offset in opposite directions in a plane that is normal to 
the plane containing the axis of symmetry of the reflector and the center of the 
feed. This offset can be a serious problem in measurements of circular polarization, 
since the result is obtained by taking the difference between measurements with 
opposite circularly polarized responses (see Table 4.3). For measurements of linear 
polarization, the offset is less serious since this involves taking the product of two 
opposite-hand outputs, and the resulting response is symmetrical about the axis of 
the parabola. The effects can be largely canceled by inserting a compensating offset
in a secondary reflector. For further details, see Chu and Turrin (1973) and Rudge 
and Adatia (1978). 

A basic point concerns the accuracy of the reflector surface. Deviations of the 
surface from the ideal profile result in variations in the phase of the electromagnetic 
field as it reaches the focus. We can think of the reflector surface as consisting of 
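many small sections that deviate from the ideal surface by $\epsilon$, a Gaussian random
variable with probability distribution

$$
p(\epsilon) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\epsilon^2/2\sigma^2} , \tag{5.1}
$$

where $\langle \epsilon \rangle = 0$, $\langle \epsilon^2 \rangle = \sigma^2$, and $\langle\ \rangle$ indicates the expectation. A relation of general
importance in probabilistic calculations is $\langle e^{j\epsilon} \rangle$, which is

$$
\langle e^{j\epsilon} \rangle = \int_{-\infty}^{\infty} p(\epsilon)\, e^{j\epsilon}\, d\epsilon = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} e^{-\epsilon^2/2\sigma^2}\, e^{j\epsilon}\, d\epsilon = e^{-\sigma^2/2} . \tag{5.2}
$$

The rightmost integral is accomplished by the method of completing the square in
the argument of the exponential, i.e., $-\left(\frac{\epsilon^2}{2\sigma^2} - j\epsilon\right) = -\frac{1}{2\sigma^2}\left(\epsilon - j\sigma^2\right)^2 - \frac{\sigma^2}{2}$. The
$e^{-\sigma^2/2}$ factor can be moved outside the integral, the rest of which is unity.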
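
As a numerical cross-check of Eq. (5.2), the following minimal sketch integrates the Gaussian probability density multiplied by the complex exponential using the trapezoidal rule and compares the result with $e^{-\sigma^2/2}$. The function name, the integration range of ten standard deviations on each side, and the number of steps are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def expectation_exp_j_eps(sigma, n_steps=20001, half_width_sigmas=10.0):
    """Trapezoidal estimate of <exp(j*eps)> for eps drawn from N(0, sigma^2)."""
    lo = -half_width_sigmas * sigma
    step = 2.0 * half_width_sigmas * sigma / (n_steps - 1)
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    total = 0j
    for i in range(n_steps):
        eps = lo + i * step
        # End points carry half weight in the trapezoidal rule.
        weight = 0.5 if i in (0, n_steps - 1) else 1.0
        pdf = norm * math.exp(-eps * eps / (2.0 * sigma * sigma))
        total += weight * pdf * cmath.exp(1j * eps)
    return total * step


sigma = 0.8  # rms deviation in radians (assumed test value)
numeric = expectation_exp_j_eps(sigma)
analytic = math.exp(-sigma ** 2 / 2.0)
print(numeric.real, analytic)
```

The imaginary part of the estimate should be negligible, since the odd (sine) part of the integrand cancels over the symmetric range.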

A surface deviation $\epsilon$ produces a deviation of approximately $2\epsilon$ in the path length
of a reflected ray; this approximation improves as the focal ratio is increased. Thus, a
deviation $\epsilon$ causes a phase shift $\phi \simeq 4\pi\epsilon/\lambda$, where $\lambda$ is the wavelength. As a result,
the electric field components at the focus have a Gaussian phase distribution with
$\sigma_\phi = 4\pi\sigma/\lambda$. If there are N independent sections of the surface, then the collecting
area, which is proportional to the square of the electric field, is given by

$$
A = A_0 \left\langle \left| \frac{1}{N} \sum_i e^{j\phi_i} \right|^2 \right\rangle = \frac{A_0}{N^2} \sum_i \sum_k \left\langle e^{j(\phi_i - \phi_k)} \right\rangle \simeq A_0 \left\langle e^{j\sqrt{2}\,\phi} \right\rangle , \tag{5.3}
$$

where $A_0$ is the collecting area for a perfect surface, and it has been assumed that
N is large enough that terms for which i = k can be ignored. The $\sqrt{2}$ factor comes
from differencing two random variables. Then from Eqs. (5.2) and (5.3), we obtain

$$
A = A_0\, e^{-(4\pi\sigma/\lambda)^2} . \tag{5.4}
$$


This equation is known in radio engineering as the Ruze formula (Ruze 1966) 
and in some other branches of astronomy as the Strehl ratio. As an example, if
$\sigma/\lambda = 1/20$, the aperture efficiency, $A/A_0$, is 0.67. In the case of antennas with
multiple reflecting surfaces, the rms deviations can be combined in the usual
root-sum-squared manner. Secondary reflectors, such as a Cassegrain subreflector, are
smaller than the main reflector, and for smaller surfaces, the rms deviation is usually
correspondingly smaller. The surface adjustment of the 12-m-diameter antennas
of the Atacama Large Millimeter/submillimeter Array (ALMA), which are
capable of operation up to ~ 900 GHz, is a good example of the accuracy that can
be achieved (Mangum et al. 2006). A study of the dynamics of the surface of the 
antennas is described by Snel et al. (2007). 
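
The arithmetic of the Ruze formula and the root-sum-squared combination of surface errors can be sketched in a few lines. The function names and the numerical values in the usage lines are illustrative assumptions; only the formula itself and the example $\sigma/\lambda = 1/20$ come from the text.

```python
import math


def ruze_efficiency(sigma, wavelength):
    """Ratio A/A_0 from Eq. (5.4) for rms surface error sigma (same units as wavelength)."""
    return math.exp(-(4.0 * math.pi * sigma / wavelength) ** 2)


def combined_rms(*sigmas):
    """Root-sum-squared combination of independent rms surface deviations."""
    return math.sqrt(sum(s * s for s in sigmas))


# Example from the text: sigma / lambda = 1/20 gives an efficiency of about 0.67.
print(round(ruze_efficiency(1.0, 20.0), 2))

# Hypothetical two-reflector antenna: main reflector 30 um rms, subreflector 10 um rms,
# evaluated at an assumed wavelength of 1 mm (all lengths in micrometers).
sigma_total = combined_rms(30.0, 10.0)
print(sigma_total, ruze_efficiency(sigma_total, 1000.0))
```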

Several techniques have been developed for improving the performance of 
parabolic antennas. An example is the adjustment of the subreflector shape to 
compensate for errors in the main reflector [see, e.g., Ingalls et al. (1994), Mayer 
et al. (1994)]. Another improvement is in the design of the focal support structure 
to minimize blockage of the aperture and reduce sidelobes in the direction of the 
ground (Lawrence et al. 1994; Welch et al. 1996). A common method of supporting 
equipment near the reflector focus is the use of a tripod or quadrupod structure. If 
the legs of the structure are connected to the edge of the main reflector rather than 
to points within the reflector aperture, they interrupt only the plane wave incident 
on the aperture, not the spherical wavefront between the reflector and the focus. 
Use of an offset-feed reflector avoids any blockage of the incident wavefront in 
reaching the focus. However, both of these methods of reducing blockage increase 
the complexity and cost of the structure. 


5.2 Sampling the Visibility Function 


5.2.1 Sampling Theorem 


The choice of configuration of the antennas of a synthesis array is largely based on 
optimizing the sampling of the visibility function in (u, v) space. Thus, in consid- 
ering array design, it is logical to start by examining the sampling requirements. 
These are governed by the sampling theorem of Fourier transforms (Bracewell 
1958). Consider first the measurement of the one-dimensional intensity distribution 
of a source, $I_1(l)$. It is necessary to measure the complex visibility V in the
corresponding direction on the ground at a series of values of the projected antenna
spacing. For example, to measure an east-west profile, a possible method is to make
observations near meridian transit of the source using an east-west baseline and to
vary the length of the baseline from day to day.

Figure 5.2a-c illustrates the sampling of the one-dimensional visibility function
V(u). The sampling operation can be represented as multiplication of V(u) by the
series of delta functions in Fig. 5.2b, which can be written

$$
\frac{1}{\Delta u}\, \mathrm{III}\!\left(\frac{u}{\Delta u}\right) = \sum_{i=-\infty}^{\infty} \delta(u - i\,\Delta u) , \tag{5.5}
$$


where the left side is included to show how the series can be expressed in terms
of the shah function, III, introduced by Bracewell and Roberts (1954). The series
extends to infinity in both positive and negative directions, and the delta functions
are uniformly spaced with an interval $\Delta u$. The Fourier transform of Eq. (5.5) is the
series of delta functions shown in Fig. 5.2e:

$$
\mathrm{III}(l\,\Delta u) = \frac{1}{\Delta u} \sum_{p=-\infty}^{\infty} \delta\!\left(l - \frac{p}{\Delta u}\right) . \tag{5.6}
$$

Fig. 5.2 Illustration of the sampling theorem: (a) visibility function V(u), real part only; (b)
sampling function in which the arrows represent delta functions; (c) sampled visibility function;
(d) intensity function $I_1(l)$; (e) replication function; (f) replicated intensity function. Functions in
(d), (e), and (f) are the Fourier transforms of those in (a), (b), and (c), respectively. (g) is the
replicated intensity function showing aliasing in the shaded areas resulting from using too large a
sampling interval.


In the l domain, the Fourier transform of the sampled visibility is the convolution
of the Fourier transform of V(u), which is the one-dimensional intensity function
$I_1(l)$, with Eq. (5.6). The result is the replication of $I_1(l)$ at intervals $(\Delta u)^{-1}$ shown
in Fig. 5.2f. If $I_1(l)$ represents a source of finite dimensions, the replications of $I_1(l)$
will not overlap as long as $I_1(l)$ is nonzero only within a range of l that is no greater
than $(\Delta u)^{-1}$. Hence, if $l_w$ is the range over which $I_1(l)$ is nonzero or, more generally,
the field of view of an observation, then the avoidance of aliasing requires $\Delta u \le 1/l_w$.
An example of overlapping replications is shown in Fig. 5.2g. The loss of
information resulting from such overlapping is commonly referred to as aliasing, 
because the components of the function within the overlapping region lose their 
identity with respect to which end of the replicated function they properly belong. 
The distortion in the replicated intensity function is said to be caused by “leakage” 
[see Bracewell (2000)]. 

The requirement for the restoration of a function from a set of samples, for
example, deriving the function in Fig. 5.2a from the samples in Fig. 5.2c, is easily
understood by considering the Fourier transforms in Fig. 5.2d and f. Interpolation
in the u domain corresponds to removing the replications in the l domain, which
can be achieved by multiplying the function in Fig. 5.2f by the rectangular function
indicated by the broken line. In the u domain, this multiplication corresponds to
convolution of the sampled values with the Fourier transform of the rectangular
function, which is the unit area sinc function,

$$
\frac{\sin(\pi u/\Delta u)}{\pi u} . \tag{5.7}
$$


If aliasing is avoided, convolution with (5.7) provides exact interpolation of the 
original function from the samples. Note that perfect restoration requires a sum 
over all samples except when the sinc function is centered on a specific sample. 
Thus, we can state, as the sampling theorem for the visibility, that if the intensity 
distribution is nonzero only within an interval of width $l_w$, $I_1(l)$ is fully specified
by sampling the visibility function at points spaced $\Delta u = l_w^{-1}$ in u. The interval
$\Delta u = l_w^{-1}$ is called the critical sampling interval. Sampling at a finer interval in u
is called oversampling and usually does no harm nor does it provide any benefit. 
Sampling at a coarser interval is called undersampling, which leads to aliasing. 
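
The interpolation described above can be tried numerically. The sketch below assumes a test source that is not in the text: a one-dimensional Gaussian of width 0.05 centered at l = 0.1, which lies well inside a field of unit width, so that its visibility is known in closed form. It uses the unit-peak form of the sinc function (the usual Whittaker-Shannon interpolation formula), which differs from the unit-area form of (5.7) only by the constant factor $\Delta u$. All names and parameter values are illustrative.

```python
import cmath
import math


def sinc(x):
    """Unit-peak sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def visibility(u, l0=0.1, sigma_l=0.05):
    """Visibility of an assumed 1-D Gaussian source centered at l0 with rms width sigma_l."""
    amplitude = math.exp(-2.0 * (math.pi * sigma_l * u) ** 2)
    return amplitude * cmath.exp(-2j * math.pi * u * l0)


def interpolate(u, delta_u, n_max=60):
    """Estimate V(u) from samples V(i * delta_u), i = -n_max..n_max, by sinc interpolation."""
    return sum(visibility(i * delta_u) * sinc((u - i * delta_u) / delta_u)
               for i in range(-n_max, n_max + 1))


u_test = 0.37
# Critical sampling for a field of unit width: the samples reproduce V(u) closely.
print(abs(interpolate(u_test, 1.0) - visibility(u_test)))
# Sampling four times too coarsely: the replications overlap and the interpolation fails.
print(abs(interpolate(1.5, 4.0) - visibility(1.5)))
```

Because a Gaussian is not strictly confined to a finite interval, the agreement in the first case is limited by the (very small) tails outside the field rather than being exact.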

Aliasing can lead to serious misinterpretation of source structure. For example, 
suppose the intensity function $I_1(l)$ consists of a number of compact separated
components. A component that lies outside the proper sampling window, i.e.,
$|l| > l_w/2$, at negative l will be aliased to a position on the positive side of the
replicated intensity function. Thus, it appears at the wrong position. This error can
be discovered by regridding the data at a finer interval $\Delta u$. An aliased component
will move in an unexpected way in the image plane. 
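
The position at which an out-of-window component appears follows directly from the replication interval $(\Delta u)^{-1}$. The helper below is a minimal sketch under that assumption; its name and the numerical values are hypothetical.

```python
def aliased_position(l, delta_u):
    """Apparent position of a component at l when the visibility is sampled at interval delta_u.

    The image window has width 1/delta_u and is centered on l = 0; positions
    outside it are wrapped back into the window by the replication.
    """
    window = 1.0 / delta_u
    return (l + 0.5 * window) % window - 0.5 * window


# A component at l = -0.6 lies outside a window of unit width (delta_u = 1)
# and is aliased to the positive side of the image.
print(aliased_position(-0.6, 1.0))
# With a finer interval (delta_u = 0.5, window width 2) it stays at its true position.
print(aliased_position(-0.6, 0.5))
```

Comparing the two results illustrates the test mentioned above: a genuine component keeps its position when the sampling interval is reduced, whereas an aliased one moves.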

The spatial sampling theorem described here is just a formulation of the standard
Shannon-Nyquist theorem normally written in the time (t)-frequency ($\nu$) domain.
Here, the critical sampling interval for a temporal waveform of bandwidth $\Delta\nu$ is
$1/(2\Delta\nu)$. The factor of two appears because the spectrum in Fourier space extends
from $-\Delta\nu$ to $+\Delta\nu$.

In two dimensions, it is simply necessary to apply the theorem separately to the 
source in the l and m directions. A compact source that is just beyond the sampling
limit at the lower left of the image will be aliased into the sampling interval in the 
upper right. For further discussion of the sampling theorem, see, for example, Unser 
(2000). 


5.2.2 Discrete Two-Dimensional Fourier Transform 


The derivation of an image (or map) from the visibility measurements is the 
subject of Chap. 10, but it is important at this point to understand the form 
in which the visibility data are required for this transformation. The discrete
Fourier transform (DFT) is very widely used in synthesis imaging because of the 
computational advantages of the fast Fourier transform (FFT) algorithm [see, e.g., 
Brigham (1988)]. The basic properties of the DFT in one dimension are described in 
Appendix 8.4. In two dimensions, the functions V(u, v) and I(l, m) are expressed as 
rectangular matrices of sampled values at uniform increments in the two variables 
involved. The rectangular grid points on which the intensity is obtained provide a 
convenient form for further data processing. 

The two-dimensional form of the discrete transform for a Fourier pair f and g is 
defined by 


$$
f(p, q) = \sum_{i=0}^{M-1} \sum_{k=0}^{N-1} g(i, k)\, e^{-j2\pi ip/M}\, e^{-j2\pi kq/N} , \tag{5.8}
$$

and the inverse is

$$
g(i, k) = \frac{1}{MN} \sum_{p=0}^{M-1} \sum_{q=0}^{N-1} f(p, q)\, e^{j2\pi ip/M}\, e^{j2\pi kq/N} . \tag{5.9}
$$

The functions are periodic with periods of M samples in the i and p dimensions
and N samples in the k and q dimensions. Evaluation of Eqs. (5.8) or (5.9) by direct
computation requires approximately $(MN)^2$ complex multiplications. In contrast, if
M and N are powers of 2, the FFT algorithm requires only $MN \log_2(MN)$ complex
multiplications. 
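
For small arrays, Eqs. (5.8) and (5.9) can be evaluated directly, which is a convenient way to confirm the sign and normalization conventions. The sketch below is a direct (not fast) evaluation written for clarity; the function names and the test array are illustrative assumptions, and the operation counts are simply the approximate expressions quoted in the text.

```python
import cmath
import math


def dft2(g):
    """Direct evaluation of Eq. (5.8) for an M x N nested list g[i][k]."""
    M, N = len(g), len(g[0])
    return [[sum(g[i][k] * cmath.exp(-2j * math.pi * (i * p / M + k * q / N))
                 for i in range(M) for k in range(N))
             for q in range(N)] for p in range(M)]


def idft2(f):
    """Direct evaluation of Eq. (5.9), including the 1/(MN) normalization."""
    M, N = len(f), len(f[0])
    return [[sum(f[p][q] * cmath.exp(2j * math.pi * (i * p / M + k * q / N))
                 for p in range(M) for q in range(N)) / (M * N)
             for k in range(N)] for i in range(M)]


def multiplication_counts(M, N):
    """Approximate complex-multiplication counts: direct evaluation and FFT."""
    return (M * N) ** 2, M * N * math.log2(M * N)


g = [[1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 0.0]]          # arbitrary 2 x 4 test array
g_back = idft2(dft2(g))             # should reproduce g to rounding error
print(max(abs(g_back[i][k] - g[i][k]) for i in range(2) for k in range(4)))
print(multiplication_counts(1024, 1024))
```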

The transformation between V(u, v) and I(l, m), where I is the source intensity
in two dimensions, is obtained by substituting $g(i, k) = I(i\,\Delta l, k\,\Delta m)$ and
$f(p, q) = V(p\,\Delta u, q\,\Delta v)$ in Eqs. (5.8) and (5.9). The relationship between the integral and
discrete forms of the Fourier transform is found in several texts; see, for example,
Rabiner and Gold (1975) or Papoulis (1977). The dimensions of the (u, v) plane that
contain these data are $M\Delta u$ by $N\Delta v$. In the (l, m) plane, the points are spaced $\Delta l$ in
l and $\Delta m$ in m, and the image dimensions are $M\Delta l$ by $N\Delta m$. The dimensions in the
two domains are related by

$$
\begin{aligned}
\Delta u &= (M\,\Delta l)^{-1}, & \Delta v &= (N\,\Delta m)^{-1}, \\
\Delta l &= (M\,\Delta u)^{-1}, & \Delta m &= (N\,\Delta v)^{-1} .
\end{aligned} \tag{5.10}
$$

The spacing between points in one domain is the reciprocal of the total dimension in
the other domain. Thus, if the size of the array in the intensity domain is chosen to be
large enough that the intensity function is nonzero only within the area $M\Delta l \times N\Delta m$,
then the spacings $\Delta u$ and $\Delta v$ in Eq. (5.10) satisfy the sampling theorem.
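
As a worked example of Eq. (5.10), the sketch below converts an assumed image geometry into the corresponding grid spacings in the (u, v) plane. The image size and pixel spacing are hypothetical values chosen only for illustration; l and m are direction cosines, so the pixel spacing is converted from arcseconds to radians under the small-angle approximation.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def uv_cell_size(M, N, delta_l, delta_m):
    """Grid spacings (delta_u, delta_v) in wavelengths from Eq. (5.10)."""
    return 1.0 / (M * delta_l), 1.0 / (N * delta_m)


# Hypothetical image: 1024 x 512 pixels with 1-arcsecond spacing in both l and m.
delta_u, delta_v = uv_cell_size(1024, 512, ARCSEC, ARCSEC)
print(delta_u, delta_v)  # roughly 201 and 403 wavelengths
```

The smaller image dimension gives the larger cell in the (u, v) plane, in keeping with the reciprocal relationship between the two domains.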


Fig. 5.3 Points on a rectangular grid in the (u, v) plane at which the visibility is sampled for use 
with the discrete Fourier transform. As shown, the spacings $\Delta u$ and $\Delta v$ are equal. The division of
the plane into grid cells of size $\Delta u \times \Delta v$ is also shown.


To apply the discrete transform to synthesis imaging, it is necessary to obtain 
values of V(u, v) at points separated by $\Delta u$ in u and by $\Delta v$ in v, as shown in
Fig. 5.3. However, the measurements are generally not made at (u, v) points on
a grid since for tracking interferometers, they fall on elliptical loci in the (u, v)
plane, as explained in Sect. 4.1. Thus, it is necessary to obtain the values at the
grid points by interpolation or similar processes. In Fig. 5.3, the plane is divided
into cells of size $\Delta u \times \Delta v$ centered on the grid points. A very simple method
of determining a visibility value to assign at each grid point is to take the mean 
of all values that fall within the same cell. This procedure has been termed cell 
averaging (Thompson and Bracewell 1974). Better procedures are generally used; 
see Sect. 10.2.2. However, the cell averaging concept helps one to visualize the 
required distribution of the measurements; ideally there should be at least one 
measurement, or a small number of measurements, within each cell. Thus, the 
baselines should be chosen so that the spacings between the (u,v) loci are no 
greater than the cell size, to maximize the number of cells that are intersected by 
a locus. Cells that contain no measurements result in holes in the (u, v) coverage, 
and minimization of such holes is an important criterion in array design. Lobanov 
(2003) and Lal et al. (2009) discuss the performance of arrays based on uniformity 
of (u, v) coverage (see Sect. 5.4.2). 
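
Cell averaging as described above can be sketched as follows. This is a minimal illustration of the concept only, not the gridding procedure of Sect. 10.2.2; the function name, the data layout, and the sample values are assumptions.

```python
import math
from collections import defaultdict


def cell_average(samples, delta_u, delta_v):
    """Average the complex visibilities that fall within each delta_u x delta_v cell.

    samples: iterable of (u, v, visibility) tuples.
    Returns {(p, q): mean visibility}, where the cell is centered on the grid
    point (p * delta_u, q * delta_v). Cells with no data are absent (holes).
    """
    sums = defaultdict(complex)
    counts = defaultdict(int)
    for u, v, vis in samples:
        cell = (math.floor(u / delta_u + 0.5), math.floor(v / delta_v + 0.5))
        sums[cell] += vis
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Three hypothetical measurements; the first two fall in the same cell.
samples = [(0.9, 0.1, 1 + 0j), (1.2, -0.3, 3 + 0j), (4.6, 2.4, 2j)]
gridded = cell_average(samples, 1.0, 1.0)
print(gridded)
```

Grid points that do not appear in the returned dictionary correspond to the holes in the (u, v) coverage discussed above.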


5.3 Introductory Discussion of Arrays


5.3.1 Phased Arrays and Correlator Arrays 


An array of antennas can be interconnected to operate as a phased array or as a 
correlator array. Figure 5.4a shows a simple schematic diagram of a phased array 
connected to a square-law detector, in which the number of antennas, $n_a$, is equal to
four. If the voltages at the antenna outputs are $V_1$, $V_2$, $V_3$, and so on, the output of
the square-law detector is proportional to

$$
(V_1 + V_2 + V_3 + \cdots + V_{n_a})^2 . \tag{5.11}
$$

Note that for $n_a$ antennas, there are $n_a(n_a - 1)$ cross-product terms of form $V_m V_n$
involving different antennas m and n, and $n_a$ self-product terms of form $V_m^2$. If the
signal path (including the phase shifter) from each antenna to the detector is of
the same electrical length, the signals combine in phase when the direction of the
incoming radiation is given by

$$
\theta = \sin^{-1}\!\left(\frac{N}{\ell_\lambda}\right) , \tag{5.12}
$$

where N is an integer, including zero, and $\ell_\lambda$ is the spacing interval measured in
wavelengths. The position angles of the maxima, which represent the beam pattern
of the array, can be varied by adjusting the phase shifters at the antenna outputs. 
Thus, the beam pattern can be controlled and, for example, scanned to form an 
image of an area of sky. 
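
Equation (5.12) can be evaluated for every integer N that gives a real angle. The sketch below assumes equal path lengths (no applied phase shifts) and a hypothetical unit spacing of two wavelengths; the function name is illustrative.

```python
import math


def beam_directions(spacing_wavelengths):
    """Angles (radians) of the in-phase maxima from Eq. (5.12), for all valid integers N."""
    n_max = math.floor(spacing_wavelengths)  # |N| cannot exceed the spacing in wavelengths
    return [math.asin(n / spacing_wavelengths) for n in range(-n_max, n_max + 1)]


# Unit spacing of two wavelengths: maxima at about -90, -30, 0, 30, and 90 degrees.
print([round(math.degrees(theta), 1) for theta in beam_directions(2.0)])
```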

In correlator arrays, a correlator generates the cross product of the signal voltages 
$V_m V_n$ for every antenna pair, as in Fig. 5.4b. These outputs take the form of
fringe patterns and can be combined to produce maxima similar to those of the 
phased array. If a phase shift is introduced at the output of one of the correlator 
array antennas, the result appears as a corresponding change in the phase of 
the fringes measured with the correlator connected to that antenna. Conversely, 
the effect of an antenna phase shift can be simulated by changing the measured 
phases when combining the correlator outputs. Thus, a beam-scanning action can 
be accomplished by combining measured cross-correlations in a computer with 
appropriate variations in the phase. This is what happens in computing the Fourier 
transform of the visibility function, that is, the Fourier transform of the correlator 
outputs as a function of spacing. The loss of the self-product terms reduces the 
instantaneous sensitivity of the correlator array by a factor $(n_a - 1)/n_a$ in power,
which is close to unity if $n_a$ is large. However, at any instant, the correlator array
responds to the whole field of the individual antennas, whereas the response of the 
phased array is determined by the narrow beam that it forms, unless it is equipped 
with a more complex signal-combining network that allows many beams to be 
formed simultaneously. Thus, in imaging, the correlator array gathers data more 
efficiently than the phased array. 

The response pattern of the correlator array to a point source is the same as 
that of the phased array, except for the self-product terms. The response of the 
phased array consists of one or more beams in the direction in which the antenna 
responses combine with equal phase. These are surrounded by sidelobes, the pattern 
and magnitude of which depend on the number and configuration of antennas. 
Between individual sidelobe peaks, there will be nulls that can be as low as zero, 
but the response is positive because the output of the square-law detector cannot go 
negative. Now consider subtracting the self-product terms, to simulate the response 
of the correlator array. Over a field of view small compared with the beamwidth 
of an individual antenna, each self-product term represents a constant level, and 
each cross product represents a fringe oscillation. In the response to a point source, 
all of these terms are of equal magnitude. Subtracting the self-products from the 
phased-array response causes the zero level to be shifted in the positive direction
by an amount equal to $1/n_a$ of the peak level, as indicated by the broken line in
Fig. 5.4c. The points that represent zeros in the phased-array response become the
peaks of negative sidelobes. Thus, in the response of the correlator array, the positive
values are decreased by a factor $(n_a - 1)/n_a$ relative to those of the phased array.
In the negative direction, the response extends to a level of $-1/(n_a - 1)$ of the
positive peak but no further since this level corresponds to the zero level of the
phased array. Kogan (1999) pointed out this limitation on the magnitude of the
negative sidelobes of a correlator array and also noted that this limit depends not
on the configuration of the individual antennas but only on their number. Neither of
these conclusions applies to the positive sidelobes. This result is strictly true only
for snapshot observations [i.e., those in which the (u, v) coverage is not significantly
increased by Earth rotation] and for uniform weighting of the correlator outputs.
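
The limit on the negative sidelobes can be checked numerically for a one-dimensional snapshot array of isotropic elements with uniform weighting. In the sketch below, the antenna positions (in wavelengths), the function names, and the scan grid are illustrative assumptions; the correlator-array response is obtained by subtracting the $n_a$ self-product terms from the phased-array power and renormalizing to unit peak.

```python
import cmath


def phased_array_response(positions, l):
    """Power response |sum_m exp(j 2 pi x_m l)|^2 / n_a^2, which is 1 at l = 0."""
    n_a = len(positions)
    field = sum(cmath.exp(2j * cmath.pi * x * l) for x in positions)
    return abs(field) ** 2 / n_a ** 2


def correlator_array_response(positions, l):
    """Response with the n_a self-products removed, renormalized to unit peak."""
    n_a = len(positions)
    power = phased_array_response(positions, l) * n_a ** 2
    return (power - n_a) / (n_a * (n_a - 1))


uniform = [0.0, 1.0, 2.0, 3.0]      # four antennas at unit spacing (assumed)
irregular = [0.0, 1.0, 4.0, 6.0]    # four antennas, irregular spacing (assumed)

# At a null of the uniform phased array, the correlator response reaches -1/(n_a - 1).
print(correlator_array_response(uniform, 0.25))
# For the irregular array the minimum over a scan in l does not go below that limit.
scan = [i / 2000.0 for i in range(2001)]
print(min(correlator_array_response(irregular, l) for l in scan))
```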

Finally, consider some characteristics of a phased array as in Fig. 5.4a. The
power combiner is a passive network, for example, the branched transmission line in
Fig. 1.13a. If a correlated waveform of power P is applied to each combiner input,
then the output power is $n_a P$. In terms of the voltage V at each input, a fraction
$1/\sqrt{n_a}$ of each voltage combines additively to produce an output of $\sqrt{n_a}\,V$, or
$n_a P$ in power. Now if the input waveforms are uncorrelated, again each contributes
$V/\sqrt{n_a}$ in voltage but the resulting powers combine additively (i.e., as the sum
of the squared voltages), so in this case, the power at the output is equal to the
power P at one input. Each input then contributes only $1/n_a$ of its power to the
output, and the remaining power is dissipated in the terminating impedances of the 
combiner inputs (i.e., radiated from the antennas if they are directly connected to 
the combiner). The signals from an unresolved source received in the main beam 
of the array are fully correlated, but the noise contributions from amplifiers at the 
antennas are uncorrelated. Thus, if there are no losses in the transmission lines or 
the combiner, the same signal-to-noise ratio at the detector is obtained by inserting 
an amplifier at the output of each antenna, or a single amplifier at the output of the 
combiner. However, such losses are often significant, so generally it is advantageous 
to use amplifiers at the antennas. Note that if half of the antennas in a phased array 
are pointed at a radio source and the others at blank sky, the signal power at the 
combiner output is one-quarter of that with all antennas pointed at the source. 
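
The power bookkeeping in this paragraph can be reproduced with a very small model of an ideal, lossless combiner. The model assumes unit impedance, so that power is the square of the voltage, and the function name and the choice of eight antennas are illustrative.

```python
import math


def combiner_output_power(input_voltages, correlated=True):
    """Output power of an ideal n_a-way combiner (unit impedance assumed).

    Each input contributes 1/sqrt(n_a) of its voltage to the output. Correlated
    inputs add in voltage; uncorrelated inputs add in power.
    """
    n_a = len(input_voltages)
    contributions = [v / math.sqrt(n_a) for v in input_voltages]
    if correlated:
        return sum(contributions) ** 2
    return sum(c * c for c in contributions)


n_a = 8
all_on_source = [1.0] * n_a                              # every antenna sees the source
half_on_source = [1.0] * (n_a // 2) + [0.0] * (n_a // 2)  # half point at blank sky

print(combiner_output_power(all_on_source))                    # about n_a * P
print(combiner_output_power(all_on_source, correlated=False))  # about P
print(combiner_output_power(half_on_source))                   # about n_a * P / 4
```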


5.3.2 Spatial Sensitivity and the Spatial Transfer Function 


We now consider the sensitivity of an antenna or array to the spatial frequencies 
on the sky. The angular response pattern of an antenna is the same in reception or 
transmission, and at this point it may be easier to consider the antenna in transmis- 
sion. Then power applied to the terminals produces a field at the antenna aperture. 
A function W(u, v) is equal to the autocorrelation function of the distribution of the
electric field across the aperture, $\mathcal{E}(x_\lambda, y_\lambda)$. Here $x_\lambda$ and $y_\lambda$ are coordinates in the
aperture plane of the antenna and are measured in wavelengths. Thus,

$$
\begin{aligned}
W(u, v) &= \mathcal{E}(x_\lambda, y_\lambda) \star\star\, \mathcal{E}^*(x_\lambda, y_\lambda) \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \mathcal{E}(x_\lambda, y_\lambda)\, \mathcal{E}^*(x_\lambda - u, y_\lambda - v)\, dx_\lambda\, dy_\lambda .
\end{aligned} \tag{5.13}
$$


The double-pentagram symbol represents two-dimensional autocorrelation. The 
integral in Eq. (5.13) is proportional to the number of ways, suitably weighted by 
the field intensity, in which a specific spacing vector (u, v) can be found within 
the antenna aperture. In reception, W(u, v) is a measure of the sensitivity of the 
antenna to different spatial frequencies. In effect, the antenna or array acts as a 
spatial frequency filter, and W (u, v) is widely referred to as the transfer function by 
analogy with the usage of this term in filter theory. W (u, v) has also been called the 
spectral sensitivity function (Bracewell 1961, 1962), which refers to the spectrum of 
spatial frequencies (not the radio frequencies) to which the array responds. We use 
the terms spatial transfer function and spatial sensitivity when discussing W(u, v). 
The area of the (u, v) plane over which measurements can be made [i.e., the support 
of W(u, v), defined as the closure of the domain within which W(u, v) is nonzero] 
is referred to as the spatial frequency coverage, or the (u, v) coverage. 

Consider the response of the antenna or array to a point source. Since the 
visibility of a point source is constant over the (u, v) plane, the measured spatial 
frequencies are proportional to W(u, v). Thus, the point-source response A(l, m)
is the Fourier transform of W(u, v). This result is formally derived by Bracewell
and Roberts (1954). [Recall from the discussion preceding Eq. (2.15) that the
point-source response is the mirror image of the antenna power pattern: A(l, m) =
A(-l, -m).] The spatial transfer function W(u, v) is an important feature in this
chapter, and Fig. 5.5 further illustrates its place in the interrelationships between
functions involved in radio imaging. 

Figure 5.6a shows an interferometer in which the antennas do not track and are 
represented by two rectangular areas. We shall assume that $\mathcal{E}(x_\lambda, y_\lambda)$ is uniformly
distributed over the apertures, such as in the case of arrays of uniformly excited 
dipoles. First suppose that the output voltages from the two apertures are summed 
and fed to a power-measuring receiver, as in some early instruments. The three 
rectangular areas in Fig. 5.6b represent the autocorrelation function of the aperture 
distributions, that is, the spatial transfer function. Note that the autocorrelation of the 
two apertures contains the autocorrelation of the individual apertures (the central 
rectangle in Fig. 5.6b) plus the cross-correlation of the two apertures (the shaded
rectangles). If the two antennas are combined using a correlator instead of a receiver 
that responds to the total received power, the spatial sensitivity is represented by 
only the shaded rectangles since the correlator forms only the cross products of 
signals from the two apertures. 

Fig. 5.5 Relationships between functions involved in imaging a source. Starting at the top left,
the autocorrelation of the aperture distribution of the electric field over an antenna $\mathcal{E}(x_\lambda, y_\lambda)$ gives
the spatial transfer function W(u, v). The measured visibility in the observation of a source is
the product of the source visibility V(u, v) and the spatial transfer function. At the top right,
the multiplication of the voltage reception pattern $V_A(l, m)$ with its complex conjugate produces
the power reception pattern A(l, m). Imaging of the source intensity distribution I(l, m) results in
convolution of this function with the antenna power pattern. The Fourier transform relationships
between the quantities in the $(x_\lambda, y_\lambda)$ and (u, v) domains, and those in the (l, m) domain, are
indicated by the bidirectional arrows. When the spatial sensitivity is built up by Earth rotation,
as in tracking arrays, it cannot, in general, be described as the autocorrelation function of any field
distribution. Only the part of the diagram below the broken line applies in such cases.

The interpretation of the spatial transfer function as the Fourier transform of the
point-source response can be applied to both the adding and correlator cases. For
example, for the correlator implementation of the interferometer in Fig. 5.6a, the
response to a point source is the Fourier transform of the function represented by
the shaded areas. This Fourier transform is

$$
\left[\frac{\sin(\pi x_{\lambda 1} l)}{\pi x_{\lambda 1} l}\right]^2 \left[\frac{\sin(\pi y_{\lambda 1} m)}{\pi y_{\lambda 1} m}\right]^2 \cos(2\pi l D_\lambda) , \tag{5.14}
$$

where $x_{\lambda 1}$ and $y_{\lambda 1}$ are the aperture dimensions, and $D_\lambda$ is the aperture separation,
all measured in wavelengths. The sinc-squared functions in (5.14) represent the
power pattern of the uniformly illuminated rectangular apertures, and the cosine 
term represents the fringe pattern. In early instruments, the relative magnitude of 
the spatial sensitivity was controlled only by the field distribution over the antennas, 
but image processing by computer enables the magnitude to be adjusted after an 
observation has been made. 
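
The point-source response (5.14) is straightforward to evaluate. The sketch below takes the aperture separation to lie along the direction associated with l, consistent with the cosine term in (5.14); the aperture dimensions and separation are hypothetical values in wavelengths, and the function names are illustrative.

```python
import math


def sinc(x):
    """Unit-peak sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def two_element_response(l, m, x1, y1, d):
    """Expression (5.14): sinc-squared aperture pattern multiplied by the fringe term.

    x1, y1: aperture dimensions; d: aperture separation (all in wavelengths).
    """
    envelope = sinc(x1 * l) ** 2 * sinc(y1 * m) ** 2
    return envelope * math.cos(2.0 * math.pi * l * d)


x1, y1, d = 10.0, 5.0, 100.0   # assumed dimensions and separation
print(two_element_response(0.0, 0.0, x1, y1, d))            # peak of the central fringe
print(two_element_response(1.0 / (2.0 * d), 0.0, x1, y1, d))  # first negative fringe
print(two_element_response(1.0 / x1, 0.0, x1, y1, d))        # first null of the envelope
```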

Fig. 5.6 The two apertures in (a) represent a two-element interferometer, the spatial transfer
function of which is shown in (b). The shaded areas contain the spatial sensitivity components 
that result from the cross-correlation of the signals from the two antennas. If the field distribution 
is uniform over the apertures, the magnitude of the spatial sensitivity is linearly tapered. This is 
indicated by c and d, which represent cross sections of the spatial transfer function. 


Some commonly used configurations of antenna arrays, and the boundaries of 
their autocorrelation functions, are shown in Fig. 5.7. The autocorrelation functions 
indicate the instantaneous spatial sensitivity for a continuous aperture in the form 
of the corresponding figure. Equation (5.13) shows that the autocorrelation function 
is the integral of the product of the field distribution with its complex conjugate 
displaced by u and v. By investigating the values of u and v for which the 
two aperture figures overlap, it is easy to determine the boundary within which 
the spatial transfer function is nonzero, using graphical procedures described by 
Bracewell (1961, 1995). It is also possible to identify ridges of high autocorrelation 
that occur for displacements at which the arms of figures such as those in Fig. 5.7a, 
b, or c are aligned. In the case of the ring, Fig. 5.7g, the autocorrelation function
is proportional to the area of overlap at the two points where the ring intersects
with its displaced replication. This area decreases monotonically for a ring of unit
diameter until $q = \sqrt{u^2 + v^2} = 1/\sqrt{2}$, where the tangents to the two rings at the
intersection points meet at an angle of $\pi/2$. For $q > 1/\sqrt{2}$, the autocorrelation function increases
as the tangents realign. The analytic form of the autocorrelation function, shown
in Fig. 5.7j, is the Fourier transform of a $J_0^2$ Bessel function, which is proportional
to $1/(q\sqrt{1 - q^2})$, for 0 < q < 1. Another interesting aperture is a filled circle,
for which the autocorrelation function decreases monotonically from q = 0 to 1
with the form $\cos^{-1}(q) - q\sqrt{1 - q^2}$, which Bracewell (2000) calls the Chinese
hat function. When the aperture is not completely filled, that is, when the figure
represents an array of discrete antennas, the spatial sensitivity takes the form of 
samples of the autocorrelation function. For example, for a cross of uniformly 
spaced antennas, the square in Fig. 5.7b would be represented by a matrix pattern 
within the square boundary. 
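
The two radial autocorrelation profiles quoted above can be tabulated to confirm their behavior: a minimum at $q = 1/\sqrt{2}$ for the thin ring, and a monotonic decline from q = 0 to 1 for the filled circle. The sketch below uses the proportional forms given in the text for apertures of unit diameter; the function names and the search grid are illustrative assumptions.

```python
import math


def ring_autocorrelation(q):
    """Form proportional to 1 / (q sqrt(1 - q^2)) for a thin ring, valid for 0 < q < 1."""
    return 1.0 / (q * math.sqrt(1.0 - q * q))


def chinese_hat(q):
    """Form cos^-1(q) - q sqrt(1 - q^2) for a filled circle, valid for 0 <= q <= 1."""
    return math.acos(q) - q * math.sqrt(1.0 - q * q)


grid = [i / 1000.0 for i in range(1, 1000)]
q_min = min(grid, key=ring_autocorrelation)
print(q_min, 1.0 / math.sqrt(2.0))         # location of the ring minimum
print(chinese_hat(0.0), chinese_hat(1.0))  # pi/2 at the origin, zero at q = 1
```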


5.3.3 Meter-Wavelength Cross and T-Shaped Arrays


A cross and its autocorrelation function are shown in Fig. 5.7a and b. It is assumed 
that the width of the arms is finite but small compared with the length of the arms. 
In the case of the Mills cross (Mills 1963) described briefly in Chap. 1, the outputs 
of the two arms go to a single cross-correlating receiver, so the spatial sensitivity is 
represented by the square in Fig. 5.7b. The narrow extensions on the centers of the 
sides of the square represent parts of the autocorrelation functions of the individual
arms, which are not formed in the cross-correlation of the arms. However, they 
are formed if the arms consist of lines of individual antennas, for which the cross- 
correlation is formed for pairs on the same arm as well as those on crossed arms. 
The case for a T-shaped array is similar and is shown in Fig. 5.7c and d.

If the sensitivity (i.e., the collecting area per unit length) is uniform along the 
arms for a cross or a corresponding T, then the weighting of the spatial sensitivity is 
uniform over the square (u, v) area; note that it does not taper linearly from the
center as in the situation in Fig. 5.6. At the edge of the square area, the spatial
sensitivity falls to zero in a distance equal to the width of the arms. Such a sharp
edge, a consequence of the uniform sensitivity, produces strong sidelobes. Therefore,
an important feature of the Mills cross design was a Gaussian taper of the coupling 
of the elements along the arms to reduce the sensitivity to about 10% at the ends. 
This greatly reduced local maxima in the response resulting from sidelobes outside 
the main beam, at the expense of some broadening of the beam. 
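
The effect of the taper can be checked numerically. The following is a minimal sketch, not a model of the actual Mills cross: the element count, the function names, and the reading of the 10% figure as an amplitude weight at the ends of an arm are all assumptions made for illustration. It compares the largest sidelobe of a uniformly weighted arm with that of a Gaussian-tapered arm.

```python
import math

def arm_pattern(weights, s):
    """Normalized response of a line of elements.

    s is the phase step between adjacent elements, in cycles
    (element spacing in wavelengths times the direction cosine).
    The weights are symmetric, so the imaginary part cancels.
    """
    mid = (len(weights) - 1) / 2.0
    total = sum(w * math.cos(2.0 * math.pi * (k - mid) * s)
                for k, w in enumerate(weights))
    return total / sum(weights)

def peak_sidelobe(weights, steps=2000, s_max=0.5):
    """Largest |response| outside the main beam for 0 <= s <= s_max."""
    resp = [abs(arm_pattern(weights, s_max * i / steps))
            for i in range(steps + 1)]
    i = 0
    # Walk down the main beam until the response stops decreasing.
    while i < steps and resp[i + 1] < resp[i]:
        i += 1
    return max(resp[i:])

N_ELEMENTS = 101                      # assumed, for illustration only
mid = (N_ELEMENTS - 1) / 2.0
uniform = [1.0] * N_ELEMENTS
# Gaussian taper falling to 10% of the central weight at the arm ends.
tapered = [math.exp(-math.log(10.0) * ((k - mid) / mid) ** 2)
           for k in range(N_ELEMENTS)]

uniform_sidelobe = peak_sidelobe(uniform)
tapered_sidelobe = peak_sidelobe(tapered)
print(f"uniform arm: peak sidelobe = {uniform_sidelobe:.3f}")
print(f"tapered arm: peak sidelobe = {tapered_sidelobe:.3f}")
```

The uniform case reproduces the familiar sinc-function sidelobe level, while the tapered arm gives a much lower value; the price, as noted above, is a broader main beam.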

Figure 1.12a shows an implementation of a T-shaped array that is an example 
of a nontracking correlator interferometer. Here, a small antenna is moved in steps, 
with continuous coverage, to simulate a larger aperture; see Blythe (1957), Ryle 
et al. (1959), and Ryle and Hewish (1960). The spatial frequency coverage is the 
same as would be obtained in a single observation with an antenna of aperture equal 
to that simulated by the movement of the small antenna, although the magnitude 
of the spatial sensitivity is not exactly the same. The term aperture synthesis was 
introduced to describe such observations, but to be precise, it is the autocorrelation 
of the aperture that is synthesized (see Sect. 5.4). 


5.4 Spatial Transfer Function of a Tracking Array 


The range of spatial frequencies that contribute to the output of an interferometer 
with tracking antennas is illustrated in Fig. 5.8b. The two shaded areas represent the 
cross-correlation of the two apertures of an east-west interferometer for a source 
on the meridian. As the source moves in hour angle, the changing (u, v) coverage is 
represented by a band centered on the spacing locus of the two antennas. Recall from 
Sect. 4.1 that the locus for an Earth-based interferometer is an arc of an ellipse, and 
that since $V(-u, -v) = V^*(u, v)$, any pair of antennas measures visibility along 
two arcs symmetric about the (u, v) origin, both of which are included in the spatial 
transfer function. 

Because the antennas track the source, the antenna beams remain centered on the 
same point in the source under investigation, and the array measures the product of 
the source intensity distribution and the antenna pattern. Another view of this effect 
is obtained by considering the radiation received by small areas of the apertures of 
two antennas, the centers of which are $A_1$ and $A_2$ in Fig. 5.9. The antenna apertures 
encompass a range of spacings from $u - d_\lambda$ to $u + d_\lambda$ wavelengths, where $d_\lambda$ 
is the antenna diameter measured in wavelengths. If the antenna beams remain 

Fig. 5.8 (a) The aperture of an east-west, two-element interferometer. The corresponding spatial 
frequency coverage for cross-correlated signals is shown by the shaded areas in (b). If the antennas 
track the source, the spacing vector traces out an elliptical locus (the solid line) in the (u, v) plane. 
The area between the broken lines in (b) indicates the spatial frequencies that contribute to the 
measured values. The spacing between the broken lines is determined by the cross-correlation of 
the antenna aperture. 


Fig. 5.9 Illustration of the effect of tracking on the fringe frequency at the correlator output. The 
u component of the baseline is shown, and the v component is omitted since it does not affect the 
fringe frequency. The curved arrow indicates the tracking motion of the antennas. 


fixed in position as a source moves through them, then the correlator output is 
a combination of fringe components with frequencies from $\omega_e (u - d_\lambda) \cos\delta$ to 
$\omega_e (u + d_\lambda) \cos\delta$, where $\omega_e$ is the angular velocity of the Earth and $\delta$ is the declination 
of the source. To examine the effect when the antennas track the source, consider 
the point B, which, because of the tracking, has a component of motion toward the 
source equal to $\omega_e \Delta u \cos\delta$ wavelengths per second. This causes a corresponding 
Doppler shift in the signal received at B. To obtain the fringe frequency for waves 
arriving at $A_1$ and B, we subtract the Doppler shift from the nontracking fringe 
frequency and obtain $[\omega_e (u + \Delta u) \cos\delta] - (\omega_e \Delta u \cos\delta) = \omega_e u \cos\delta$. The fringe 
frequency when tracking is thus the same as for the central points $A_1$ and $A_2$ of 
the apertures. (This is true for any pair of points; choosing one point at an antenna 
center in the example above slightly simplifies the discussion.) Thus, if the antennas 
track, the contributions from all pairs of points within the apertures appear at the 
same fringe frequency at the correlator output. As a result, such contributions 
cannot be separated by Fourier analysis of the correlator output waveform, and 
information on how the visibility varies over the range $u - d_\lambda$ to $u + d_\lambda$ is lost. 
However, if the antenna motion differs from a purely tracking one, the information 
is, in principle, recoverable. In imaging sources wider than the antenna beams, an 
additional scanning motion to cover the source is added to the tracking motion. In 
effect, this scanning allows the visibility to be sampled at intervals in u and v that 
are fine enough for the extended width of the source. This technique, known as 
mosaicking, is described in Sect. 11.5. 
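
The fringe-frequency argument above can be followed with a few lines of arithmetic. This is only an illustrative sketch: the spacing, the antenna diameter, the declination, and the function names are assumed values, not taken from any particular instrument. It evaluates the fringe frequency for pairs of points offset across the aperture, first with fixed beams and then with the Doppler shift caused by tracking subtracted.

```python
import math

OMEGA_E = 7.2921e-5   # angular velocity of the Earth, rad/s

def fringe_frequency_fixed(spacing, dec_rad):
    """Fringe frequency (Hz) for two aperture points `spacing` wavelengths
    apart when the antenna beams are fixed."""
    return OMEGA_E * spacing * math.cos(dec_rad)

def fringe_frequency_tracking(u, delta_u, dec_rad):
    """Fringe frequency (Hz) for a pair of points when the antennas track.

    One point is offset by delta_u wavelengths from its antenna center;
    tracking gives it a velocity toward the source that Doppler-shifts
    its signal, and that shift is subtracted from the nontracking value.
    """
    nontracking = fringe_frequency_fixed(u + delta_u, dec_rad)
    doppler = OMEGA_E * delta_u * math.cos(dec_rad)
    return nontracking - doppler

u = 10_000.0          # center-to-center spacing in wavelengths (assumed)
d_lambda = 100.0      # antenna diameter in wavelengths (assumed)
dec = math.radians(30.0)

offsets = [-d_lambda, -0.5 * d_lambda, 0.0, 0.5 * d_lambda, d_lambda]
fixed = [fringe_frequency_fixed(u + du, dec) for du in offsets]
tracking = [fringe_frequency_tracking(u, du, dec) for du in offsets]
for du, f_fix, f_trk in zip(offsets, fixed, tracking):
    print(f"offset {du:+7.1f}: fixed {f_fix:.5f} Hz, tracking {f_trk:.5f} Hz")
```

With fixed beams the frequencies spread over a range set by the antenna diameter; with tracking every offset gives the same value, which is why the contributions cannot be separated by Fourier analysis of the correlator output.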

To accommodate the effects that result when the antennas track the source 
position, the normalized antenna pattern is treated as a modification to the intensity 
distribution, which then becomes $A_N(l, m) I(l, m)$. The spatial transfer function 
$W(u, v)$ for a pair of tracking antennas is represented at any instant by a pair of 
two-dimensional delta functions ${}^2\delta(u, v)$ and ${}^2\delta(-u, -v)$. For an array of antennas, 
the resulting spatial transfer function is represented by a series of delta functions 
weighted in proportion to the magnitude of the instrumental response. As the Earth 
rotates, these delta functions generate the ensemble of elliptical spacing loci. The 
loci represent the spatial transfer function of a tracking array. 

Consider observation of a source $I(l, m)$, for which the visibility function is 
$V(u, v)$, with normalized antenna patterns $A_N(l, m)$. Then if $W(u, v)$ is the spatial 
transfer function, the measured visibility is 


$$[V(u, v) \ast\ast \overline{A}_N(u, v)]\, W(u, v) , \tag{5.15}$$


where the double asterisk indicates two-dimensional convolution and the bar 
denotes the Fourier transform. The Fourier transform of (5.15) gives the measured 
intensity: 


$$[I(l, m) A_N(l, m)] \ast\ast \overline{W}(l, m) . \tag{5.16}$$


If we observe a point source at the (l, m) origin, where $A_N = 1$, expression (5.16) 
becomes the point-source response $b_0(l, m)$. We then obtain 


$$b_0(l, m) = [{}^2\delta(l, m) A_N(l, m)] \ast\ast \overline{W}(l, m) = \overline{W}(l, m) , \tag{5.17}$$


where the two-dimensional delta function, ${}^2\delta(l, m)$, represents the point source. 
Here again, the point-source response is the Fourier transform of the spatial 
transfer function. In the tracking case, the spatial frequencies that contribute to the 
measurement are represented by $W(u, v) \ast\ast \overline{A}_N(u, v)$. Note that $\overline{A}_N(u, v)$ is twice as 
wide as the corresponding antenna aperture in the (x, y) domain. 
The term aperture synthesis is sometimes extended to include observations 
that involve hour-angle tracking. However, it is not possible to define an exactly 
equivalent antenna aperture for a tracking array. For example, consider the case of 
two antennas with an east-west baseline tracking a source for a period of 12 h. 
The spatial transfer function is an ellipse centered on the origin of the (u, v) plane, 
with zero sensitivity within the ellipse (except for a point at the origin that could be 
supplied by a measurement of total power received in the antennas). The equivalent 
aperture would be a function, the autocorrelation of which is the same elliptical ring 
as the spatial transfer function. No such aperture function exists, and thus the term 
“aperture synthesis” can only loosely be applied to describe most observations that 
include hour-angle tracking. 


5.4.1 Desirable Characteristics of the Spatial Transfer 
Function 


As a first step in considering the layout of the antennas, it is useful to consider the 
desired spatial (u, v) coverage [see, e.g., Keto (1997)]. For any specific observation, 
the optimum (u, v) coverage clearly depends on the expected intensity distribution 
of the source under study, since one would prefer to concentrate the capacity of the 
instrument in (u, v) regions where the visibility is nonzero. However, most large 
arrays are used for a wide range of astronomical objects, so some compromise 
approach is required. Since, in general, astronomical objects are aligned at random 
in the sky, there is no preferred direction for the highest resolution. Thus, it is logical 
to aim for visibility measurements that extend over a circular area centered on the 
(u, v) origin. 

As described in Sect. 5.2.2, the visibility data may be interpolated onto a 
rectangular grid for convenience in Fourier transformation, and if approximately 
equal numbers of measurements are used for each grid point, they can be given 
equal weights in the transformation. Uneven weighting results in loss of sensitivity, 
since some values then contain a larger component of noise than others. From 
this viewpoint, one would like the natural weighting (i.e., the weighting of the 
measurements that results from the array configuration without further adjustment) 
to be as uniform as possible within the circular area. 

For a general-purpose array, it is difficult to improve on the circularity of 
the measurement area. However, there are exceptions to the uniformity of the 
measurements within the circle. As mentioned above, in the Mills cross, uniform 
coupling of the radiating elements along the arms would result in uniform spatial 
sensitivity. To reduce sidelobes, a Gaussian taper of the coupling was introduced, 
resulting in a similar taper in the spatial sensitivity. This was particularly important 
because at the frequencies for which this type of instrument was constructed, 
typically in the range 85–408 MHz, source confusion can be a serious problem. 
Sidelobe responses can be mistaken for sources and can also mask genuine sources. 
For a spatial sensitivity function of uniform rectangular character, the beam has a 
sinc function $(\sin \pi x)/\pi x$ profile, for which the first sidelobe has a relative strength 
of 0.217. For a uniform, circular, spatial transfer function, the beam has a profile 
of the form $J_1(\pi x)/\pi x$, for which the first sidelobe has a relative strength of 0.132. 
Sidelobes for a uniform circular (u, v) coverage are less than for a rectangular one 
but would still be a problem in conditions of source confusion. Tapering of the 
antenna illumination reduces the sidelobe responses. Thus, the uniform weighting 
may not be optimum for conditions of high source density. 
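
The two sidelobe levels quoted above can be reproduced with a short calculation. The sketch below is an illustration only; the function names are hypothetical, the Bessel function is evaluated from Bessel's integral so that nothing beyond the standard library is needed, and both profiles are scaled to unity at the beam center before the first sidelobe is located.

```python
import math

def sinc_profile(x):
    """sin(pi x)/(pi x): beam profile for uniform rectangular coverage."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def bessel_j1(z, n=200):
    """J1(z) from Bessel's integral, (1/pi) * int_0^pi cos(t - z sin t) dt,
    evaluated with the midpoint rule."""
    h = math.pi / n
    return sum(math.cos(t - z * math.sin(t))
               for t in ((k + 0.5) * h for k in range(n))) * h / math.pi

def circular_profile(x):
    """2 J1(pi x)/(pi x): beam profile for uniform circular coverage,
    scaled to unity at x = 0."""
    if x == 0:
        return 1.0
    z = math.pi * x
    return 2.0 * bessel_j1(z) / z

def first_sidelobe(profile, dx=0.002, x_max=4.0):
    """Peak |profile| between its first and second sign changes."""
    prev = profile(0.0)
    crossings, peak = 0, 0.0
    for i in range(1, int(x_max / dx) + 1):
        cur = profile(i * dx)
        if cur * prev < 0:
            crossings += 1
            if crossings == 2:
                break
        if crossings == 1:
            peak = max(peak, abs(cur))
        if cur != 0:
            prev = cur
    return peak

sinc_level = first_sidelobe(sinc_profile)
circular_level = first_sidelobe(circular_profile)
print(f"rectangular coverage: first sidelobe = {sinc_level:.3f}")
print(f"circular coverage:    first sidelobe = {circular_level:.3f}")
```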


5.4.2 Holes in the Spatial Frequency Coverage 


Consider a circular (u, v) area of diameter $a_\lambda$ wavelengths in which there are no 
holes in the data; that is, the visibility data interpolated onto a rectangular grid 
for Fourier transformation have no missing values. Then for uniform weighting, the 
synthesized beam, which is obtained from the Fourier transform of the gridded 
transfer function, has the form $J_1(\pi a_\lambda \theta)/\pi a_\lambda \theta$, where $\theta$ is the angle measured 
from the beam center. If centrally concentrated weighting is used, the beam is a 
smoothed form of this function. Let us refer to the (u, v) area described above as the 
complete (u, v) coverage and the resulting beam as the complete response. Now if 
some data are missing, the actual (u, v) coverage is equal to the complete coverage 
minus the (u, v) hole distribution. By the additive property of Fourier transforms, the 
corresponding synthesized beam is equal to the complete response minus the Fourier 
transform of the hole distribution. The holes result in an unwanted component to 
the complete response, in effect adding sidelobes to the synthesized beam. From 
Parseval’s theorem, the rms amplitude of the hole-induced sidelobes is proportional 
to the rms value of the missing spatial sensitivity represented by the holes. Other 
sidelobes also occur as a result of the oscillations in the $J_1(\pi a_\lambda \theta)/\pi a_\lambda \theta$ profile of 
the complete response, but there is clearly a sidelobe component from the holes. 
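
A one-dimensional stand-in makes both statements concrete. The sketch below is an assumption-laden illustration (a 64-cell grid, an arbitrary extent of complete coverage, arbitrary hole positions, and a plain discrete Fourier transform in place of the two-dimensional gridded transform): it checks that the actual beam equals the complete response minus the transform of the holes, and that the rms of the hole-induced component is proportional to the rms of the holes.

```python
import cmath
import math

def dft(seq):
    """Plain discrete Fourier transform (adequate for a short sequence)."""
    n = len(seq)
    return [sum(s * cmath.exp(-2j * cmath.pi * k * m / n)
                for m, s in enumerate(seq))
            for k in range(n)]

N = 64
CENTER = N // 2
# Complete coverage: unit weight out to +/-20 cells from the origin (assumed).
complete = [1.0 if abs(m - CENTER) <= 20 else 0.0 for m in range(N)]
# Holes: a few missing cells, placed symmetrically about the origin (assumed).
hole_cells = {CENTER - 15, CENTER - 7, CENTER + 7, CENTER + 15}
holes = [1.0 if m in hole_cells else 0.0 for m in range(N)]
actual = [c - h for c, h in zip(complete, holes)]

beam_complete = dft(complete)
beam_holes = dft(holes)
beam_actual = dft(actual)

# Additive property: actual beam = complete response - transform of the holes.
additivity_error = max(abs(a - (c - h))
                       for a, c, h in zip(beam_actual, beam_complete, beam_holes))

# Parseval: rms of the hole-induced component scales with the rms of the holes.
rms_hole_sidelobes = math.sqrt(sum(abs(h) ** 2 for h in beam_holes) / N)
rms_holes = math.sqrt(sum(h ** 2 for h in holes) / N)

print(f"largest additivity error: {additivity_error:.2e}")
print(f"rms ratio (sidelobes / holes): {rms_hole_sidelobes / rms_holes:.3f}")
```

For this unnormalized transform the ratio of the two rms values is the square root of the number of grid cells, whatever the hole positions.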


5.5 Linear Tracking Arrays 


We now consider interferometers or arrays in which the locations of the antennas are 
confined to a straight line. We have seen that for pairs of antennas with east-west 
spacings, the tracking loci in the (u, v) plane are a series of ellipses centered on the 
(u, v) origin. To obtain complete ellipses, it is necessary that the tracking covers a 
range of 12 h in hour angle. If the antenna spacings of an east-west array increase 
in uniform increments, the spatial sensitivity is represented by a series of concentric 
ellipses with uniform increments in their axes. The angular resolution obtained 
is inversely proportional to the width of the (u, v) coverage in the corresponding 
Fig. 5.10 Two linear array configurations in which the antennas are represented by filled circles. 
(a) Arsac’s (1955) configuration containing all spacings up to six times the unit spacing, with no 
redundancy. (b) Bracewell’s (1966) configuration containing all spacings up to nine times the unit 
spacing, with the unit spacing occurring twice. 


direction; the width in the v direction is equal to that in the u direction times the 
sine of the declination, $\delta$. East-west linear arrays containing spacings at multiples 
of a basic interval have found wide use, especially in earlier radio astronomy, for 
observations at $|\delta|$ greater than ~30°. 
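
The foreshortening in v can be seen by generating the locus for a single east-west baseline. The sketch below is illustrative: the baseline length, the declination, and the function name are assumptions, and the locus is computed from the standard east-west relations u = L cos H and v = L sin H sin(dec), where L is the baseline length in wavelengths and H is the hour angle.

```python
import math

def east_west_locus(length_wl, dec_deg, n_points=49):
    """(u, v) points for an east-west baseline of length_wl wavelengths,
    tracked from hour angle -6 h to +6 h."""
    dec = math.radians(dec_deg)
    points = []
    for i in range(n_points):
        hour_angle = math.radians(-90.0 + 180.0 * i / (n_points - 1))
        u = length_wl * math.cos(hour_angle)
        v = length_wl * math.sin(hour_angle) * math.sin(dec)
        points.append((u, v))
    return points

LENGTH = 1000.0   # baseline length in wavelengths (assumed)
DEC = 30.0        # source declination in degrees (assumed)
locus = east_west_locus(LENGTH, DEC)

u_extent = max(abs(u) for u, _ in locus)
v_extent = max(abs(v) for _, v in locus)
axial_ratio = v_extent / u_extent
print(f"v extent / u extent = {axial_ratio:.3f}")
print(f"sin(declination)    = {math.sin(math.radians(DEC)):.3f}")
```

The 12 h of tracking traces half of the ellipse; the other half follows from the conjugate symmetry of the visibility.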

In the simplest type of linear array, the antennas are spaced at uniform intervals 
$\ell_\lambda$ (see Fig. 1.13a). This type of array is sometimes known as a grating array, by 
analogy with an optical diffraction grating. If there are $n_a$ antennas, the output of such 
an array contains $(n_a - 1)$ combinations with the unit spacing, $(n_a - 2)$ with twice the 
unit spacing, and so on. Thus, short spacings are highly redundant, and one is led 
to seek other ways to configure the antennas to provide larger numbers of different 
spacings for a given $n_a$. Note, however, that redundant observations can be used as 
an aid in calibration of the instrumental response and atmospheric effects, so some 
degree of redundancy is arguably beneficial (Hamaker et al. 1977). 
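
The redundancy of a grating array is easy to tabulate. In the sketch below, the choice of eight antennas and the function name are assumptions for illustration; the code simply counts how many antenna pairs share each spacing.

```python
from collections import Counter
from itertools import combinations

def spacing_counts(positions):
    """Number of antenna pairs at each spacing (positions in unit spacings)."""
    return Counter(abs(b - a) for a, b in combinations(positions, 2))

n_a = 8                                # number of antennas (assumed)
grating = list(range(n_a))             # uniform unit spacing
counts = spacing_counts(grating)

for spacing in sorted(counts):
    print(f"spacing {spacing}: {counts[spacing]} pairs")
print(f"{sum(counts.values())} pairs give only {len(counts)} different spacings")
```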

Early examples of antenna configurations include one in Fig. 5.10a, used by 
Arsac (1955), with no redundant spacings. The six possible pair combinations 
all have different spacings. With more than four antennas, there is always either 
some redundancy or some missing spacings. A five-element, minimum-redundancy¹ 
configuration devised by Bracewell (1966) is shown in Fig. 5.10b. Moffet (1968) 
listed examples of minimum-redundancy arrays of up to 11 elements, and solutions 
for larger arrays are discussed by Ishiguro (1980). Moffet defined two classes: 
restricted arrays, in which all spacings up to the maximum spacing, $n_{\max}\ell_\lambda$ (that 
is, the total length of the array), are present; and general arrays, in which all spacings up 
to some particular value are present, together with some longer ones. Examples for eight 
elements are shown in Fig. 5.11. A measure of redundancy for a linear array is given 
by the expression 


$$\frac{1}{2}\, n_a (n_a - 1) / n_{\max} , \tag{5.18}$$


which is the number of antenna pairs divided by the number of unit spacings in the 
longest spacing. This is equal to 1.0 and 1.11 for the configurations in Fig. 5.10a 


¹The mathematical theory of minimum redundancy is known as the optimal Golomb ruler (Golomb 
1972), which has roots in the mathematical literature going back to the 1930s. 
Fig. 5.11 Eight-element, minimum-redundancy, linear arrays: the numbers indicate spacings in 
multiples of the unit spacing. (a) Two arrays that uniformly cover the range of 1 to 23 times the 
unit spacing. (b) An array that uniformly covers 1 to 24 times the unit spacing but has a length of 
39 times the unit spacing. The extra spacings are 8, 31 (twice), and 39 times the unit spacing. © 
1968 IEEE. Reprinted with permission, from A. T. Moffet (1968). 


and 5.10b, respectively. A study in number theory by Leech (1956) indicates that 
for large numbers of elements, this redundancy factor approaches 4/3. A linear 
minimum-redundancy array that uses the configuration in Fig. 5.10b is described by 
Bracewell et al. (1973). For arrays with such small numbers of antennas, the choice 
of the configuration is particularly important. 
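
The redundancy measure can be evaluated for small arrays as follows. This is a sketch under stated assumptions: the antenna positions used here are not read from Fig. 5.10 but are one choice of positions that matches the descriptions in its caption (all spacings up to six units with no redundancy, and all spacings up to nine units with the unit spacing occurring twice), and the function names are hypothetical.

```python
from itertools import combinations

def redundancy(positions):
    """Number of antenna pairs divided by the longest spacing in unit spacings."""
    n_a = len(positions)
    n_max = max(positions) - min(positions)
    return 0.5 * n_a * (n_a - 1) / n_max

def missing_spacings(positions):
    """Multiples of the unit spacing, up to the array length, that no pair provides."""
    present = {abs(b - a) for a, b in combinations(positions, 2)}
    n_max = max(positions) - min(positions)
    return sorted(set(range(1, n_max + 1)) - present)

# Assumed positions, in unit spacings, consistent with the caption of Fig. 5.10.
four_element = [0, 1, 4, 6]
five_element = [0, 1, 2, 6, 9]

for name, array in (("four-element", four_element), ("five-element", five_element)):
    print(f"{name}: redundancy = {redundancy(array):.2f}, "
          f"missing spacings = {missing_spacings(array)}")
```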

The ability to move a small number of elements adds greatly to the range of 
performance of an array. Figure 5.12 shows the arrangement of three antennas 
in an early synthesis instrument, the Cambridge One-Mile Radio Telescope (Ryle 
1962). Antennas 1 and 2 are fixed, and their outputs are correlated with that from 
antenna 3, which can be moved on a rail track. In each position of antenna 3, the 
source under observation is tracked for 12 h, and visibility data are obtained over 
two elliptical loci in the (u, v) plane. The observation is repeated as antenna 3 is 
moved progressively along the track, and the increments in the position of this 
antenna determine the spacing of the elliptical loci in the (u, v) plane. From the 
sampling theorem (Sect. 5.2.1), the required (u, v) spacing is the reciprocal of the 
angular width (in radians) of the source under investigation. The ability to vary the 
incremental spacing adds versatility to the array and reduces the number of antennas 
required. The configuration of a larger instrument of this type, the Westerbork 
Synthesis Radio Telescope (Baars and Hooghoudt 1974; Högbom and Brouw 1974; 
Raimond and Genee 1996), is shown in Fig. 5.13. Here, ten fixed antennas are 


Fig. 5.12 The Cambridge One-Mile Radio Telescope. Antennas 1 and 2 are at fixed locations, and 
the signals they receive are each correlated with the signal from antenna 3, which can be located 
at various positions along a rail track. The fixed antennas are 762 m apart, and the rail track is a 
further 762 m long. The unit spacing is equal to the increment of the position of antenna 3, and all 
multiples up to 1524 m can be obtained. 
Fig. 5.13 Antenna configuration of the Westerbork Synthesis Radio Telescope. The ten filled 
circles represent antennas at fixed locations, and the four open circles represent antennas that are 
movable on rail tracks. The signals from each of the fixed antennas are combined with the signals 
from each of the movable ones. The diameter of the antennas is 25 m, and the spacing of the fixed 
antennas is 144 m. 


combined with four movable ones, and the rate of gathering data is approximately 
20 times greater than with the three-element array. 
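
Both numbers in this discussion follow from simple arithmetic, sketched below. The 10-arcmin source width is an assumed example value and the function names are hypothetical; the pair counts use only the fixed-movable combinations described above.

```python
import math

def max_increment_wavelengths(source_width_arcmin):
    """Largest (u, v) increment, in wavelengths, allowed by the sampling
    theorem for a source of the given angular width."""
    width_rad = math.radians(source_width_arcmin / 60.0)
    return 1.0 / width_rad

def simultaneous_pairs(n_fixed, n_movable):
    """Number of fixed-movable antenna pairs correlated at one time."""
    return n_fixed * n_movable

increment = max_increment_wavelengths(10.0)   # 10-arcmin source (assumed)
one_mile = simultaneous_pairs(2, 1)           # two fixed, one movable
westerbork = simultaneous_pairs(10, 4)        # ten fixed, four movable
rate_ratio = westerbork / one_mile

print(f"increment for a 10-arcmin source: {increment:.1f} wavelengths")
print(f"pairs: {one_mile} versus {westerbork}, ratio {rate_ratio:.0f}")
```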

The sampling of the visibility function at points on concentric, equally spaced 
ellipses results in the introduction of ringlobe responses. These may be understood 
by noting that for a linear array, the instantaneous spacings are represented in one 
dimension by a series of $\delta$ functions, as shown in Fig. 5.14a. If the array contains 
all multiples of the unit spacings up to $N\ell_\lambda$, and if the corresponding visibility 
measurements are combined with equal weights, the instantaneous response is a 
series of fan beams, each with a profile of sinc-function form, as in Fig. 5.14b. 
This follows from the Fourier transform relationship for a truncated series of delta 
functions (see Appendix 2.1): 


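$$\sum_{i=-N}^{N} \delta(u - i\ell_\lambda) \longleftrightarrow \frac{\sin[(2N + 1)\pi \ell_\lambda l]}{\pi \ell_\lambda l} \ast \sum_{k=-\infty}^{\infty} \delta\!\left(l - \frac{k}{\ell_\lambda}\right) . \tag{5.19}$$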


The delta functions on the left side represent the spacings in the u domain. The series 
on the left is truncated and can be envisaged as selected from an infinite series by 
multiplication with a rectangular window function. The right side represents the 


Fig. 5.14 (a) Part of a series of $\delta$ functions representing the instantaneous distribution of spacings for 
a uniformly spaced linear array with equal weight for each spacing. (b) Part of the corresponding 
series of fan beams that constitute the instantaneous response. Parts (a) and (b) represent the left 
and right sides of Eq. (5.19), respectively. 
Fig. 5.15 Example of ringlobes. The response of an array for which the spatial transfer function is 
a series of nine circles concentric with the (u, v) origin, resulting, for example, from observations 
with an east-west linear array with 12-h tracking at a high declination. The radii of these circles 
are consecutive integral multiples of the unit antenna spacing. The weighting corresponds to 
the principal response discussed in Sect. 10.2. From Bracewell and Thompson (1973). © AAS. 
Reproduced with permission. 


beam pattern in which the Fourier transform of the window function is replicated by 
convolution with delta functions. As the Earth’s rotation causes the spacing vectors 
to sweep out ellipses in the (u, v) plane, the corresponding rotation of the array 
relative to the sky can be visualized as causing a central fan beam to rotate into 
a narrow pencil beam, while its neighbors give rise to lower-level, ring-shaped 
responses concentric with the central beam, as shown in Fig. 5.15. This general 
argument gives the correct spacing of the ringlobes, the profile of which is modified 
from the sinc-function form. 
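
The instantaneous response of a uniformly spaced linear array can be evaluated directly. The sketch below is an illustration under assumed values (nine spacings on each side of the origin, a unit spacing of 100 wavelengths, hypothetical function names). It sums unit-weight delta functions at integer multiples of the unit spacing and compares the result with the closed form sin[(2N+1)πsl]/sin(πsl), a standard identity for this sum, in which s is the unit spacing in wavelengths and l is the direction cosine.

```python
import math

def array_response(l, n_spacings, unit_spacing):
    """Fourier sum over unit-weight delta functions at u = i * unit_spacing,
    i = -N..N. The terms pair up, so the result is real."""
    return sum(math.cos(2.0 * math.pi * i * unit_spacing * l)
               for i in range(-n_spacings, n_spacings + 1))

def closed_form(l, n_spacings, unit_spacing):
    """sin[(2N+1) pi s l] / sin(pi s l), with the limiting value at the peaks."""
    denominator = math.sin(math.pi * unit_spacing * l)
    if abs(denominator) < 1e-12:
        return float(2 * n_spacings + 1)
    return math.sin((2 * n_spacings + 1) * math.pi * unit_spacing * l) / denominator

N = 9              # largest spacing in units of the unit spacing (assumed)
SPACING = 100.0    # unit spacing in wavelengths (assumed)

lobe_separation = 1.0 / SPACING                 # spacing of the fan beams in l
first_null = 1.0 / ((2 * N + 1) * SPACING)      # half-width of each fan beam

for l in (0.0, first_null, 0.0037, lobe_separation):
    print(f"l = {l:.5f}: sum = {array_response(l, N, SPACING):8.4f}, "
          f"closed form = {closed_form(l, N, SPACING):8.4f}")
```

The response repeats at intervals of the reciprocal of the unit spacing, and these repeated fan beams are the ones that rotate into ringlobes as the Earth turns.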

If the spatial sensitivity in the (u, v) plane is a series of circular delta functions 
of radius q, 2q, . . . , Nq, the profile of the kth ringlobe is of the form 


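$$\mathrm{sinc}^{1/2}\!\left[2\left(N + \tfrac{1}{2}\right)(qr - k)\right] , \tag{5.20}$$


where $r = \sqrt{l^2 + m^2}$. The function $\mathrm{sinc}^{1/2}(\chi)$ is plotted in Fig. 5.16 and is the 
half-order derivative of $(\sin \pi\chi)/\pi\chi$. It can be computed using Fresnel integrals 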
(Bracewell and Thompson 1973). 

The application of the sampling theorem (Sect. 5.2.1) to the choice of incremental 
spacing requires that the increment be no greater than the reciprocal of the source 
width. In terms of ringlobes, this condition ensures that the minimum ringlobe 
spacing is no less than the source width. Thus, if the sampling theorem is followed, 
the main-beam response to a source just avoids being overlapped by a ringlobe 
response to the same source. In arrays such as those in Figs. 5.12 and 5.13, ringlobes 
can be effectively suppressed if the movable antennas are positioned in steps slightly 
Fig. 5.16 Cross section of a ringlobe in the principal response to a point source of an east-west 
array with uniform increments in antenna spacing. The left side is the inside of the ring, and the 
right is the outside. The dotted line indicates a negative mean level of the oscillations on the inner 
side. From Bracewell and Thompson (1973). © AAS. Reproduced with permission. 


less than the antenna diameter, in which case the ringlobe lies outside the primary 
antenna beam. Note, however, that the first spacing cannot be less than the antenna 
diameter, and the missing low-spacing measurements may have to be obtained by 
other means (see the discussion of mosaicking in Sect. 11.5). Ringlobes can also 
be greatly reduced by image-processing techniques such as the CLEAN algorithm, 
which is described in Sect. 11.1. 

Although the elliptical loci in the (u, v) plane are spaced at equal intervals, the 
natural weighting of the data for an east-west linear array is not uniform, because 
in any interval of time, the antenna-spacing vectors move a distance proportional 
to their length. In the projection of the (u, v) plane onto the equatorial plane of the 
Earth, which is discussed in Sect. 4.2 as the (u', v') plane, the spacing vectors rotate 
at constant angular velocity, and the density of measured points is proportional to 


$$q'^{-1} = (u'^2 + v'^2)^{-1/2} = (u^2 + v^2 \operatorname{cosec}^2\delta)^{-1/2} . \tag{5.21}$$


In the (u, v) plane, the density of measurements, averaged over an area of dimensions 
comparable to the unit spacing of the antennas, is inversely proportional to 
$\sqrt{u^2 + v^2 \operatorname{cosec}^2\delta}$. Along a straight line through the (u, v) origin, the density is 
inversely proportional to $\sqrt{u^2 + v^2}$. 
5.6 Two-Dimensional Tracking Arrays 


As noted previously, the spatial frequency coverage for an east-west linear array 
becomes severely foreshortened in the v dimension for observations near the 
celestial equator. For such observations, a configuration of antennas is required in 
which the Z component of the antenna spacing, as defined in Sect. 4.1, is comparable 
to the X and Y components. This is achieved by including spacings with azimuths 
other than east-west. The configuration is then two-dimensional. An array located 
at an intermediate latitude and designed to operate at low declinations can cover the 
sky from the pole to declinations of about 30° into the opposite celestial hemisphere. 
This range includes about 70% of the total sky, that is, almost three times as much 
as that of an east-west array. Since the Z component is not zero, the elliptical 
(u, v) loci are broken into two parts, as shown in Fig. 4.4. As a result, the pattern 
of the (u, v) coverage is more complex than is the case for an east-west linear 
array, and the ringlobes that result from uniform spacing of the loci are replaced 
by more complex sidelobe structure. In two dimensions, the choice of a minimum- 
redundancy configuration of antennas is not as simple as for a linear array. A first 
step is to consider the desired spatial transfer function W(u, v). There is no direct 
analytical way to go from W(u, v) to the antenna configuration, but iterative methods 
of finding an optimum, or near-optimum, solution can be used. 

First, consider the effect of tracking a source across the sky, and suppose that 
for a source near the zenith, the instantaneous spatial frequency coverage results 
in approximately uniform sampling within a circle centered on the (u, v) origin. 
At any time during the period of tracking of the source, the (u, v) coverage is the 
zenith coverage projected onto the plane of the sky, with some degree of rotation 
that depends on the hour angle and declination of the source. The projection results 
in foreshortening of the coverage from a circular to an elliptical area, still centered 
on the (u, v) origin, and this foreshortening is least at meridian transit. The effect 
of observing over a range of hour angle can be envisaged as averaging a range of 
elliptical (u, v) areas that suffer some rotation of the major axis. At the center of 
the (u, v) plane will be an area that remained within the foreshortened coverage 
over the whole observation, and if the instantaneous coverage is uniform, then it 
will remain uniform within this area. Outside the area, the foreshortening will cause 
the coverage to taper off smoothly. These effects depend on the declination of the 
source and the range of hour-angle tracking. Practical experience indicates that some 
tapering of the visibility measurements is seldom a serious problem. Thus, it can 
generally be expected that two-dimensional arrays in which the number of antennas 
is large enough to provide good instantaneous (u, v) coverage will also provide good 
performance when used with hour-angle tracking. 
5.6.1 Open-Ended Configurations 


For configurations with open-ended arms such as the cross, T, and Y, the spatial 
frequency coverage is shown in Fig. 5.7. The spatial frequency coverage of the cross 
and T has fourfold symmetry in both cases; we ignore the effect of the missing small 
extensions on the top and bottom sides of the square for the T. The spatial frequency 
coverage of the equiangular Y-shaped array (120° between adjacent arms) has 
sixfold symmetry. (n-fold symmetry denotes a figure that is unchanged by rotation 
through $2\pi/n$. For a circle, n becomes infinite, and other figures approach circular 
symmetry as n increases.) The autocorrelation function of the equiangular Y-shaped 
array is closer to circular symmetry than that of a cross or T-shaped array. In this 
respect, a five-armed array, as suggested by Hjellming (1989), would be better still, 
but more expensive. 

As an example of the open-ended configuration, we examine some details of 
the design of the VLA (Thompson et al. 1980; Napier et al. 1983; Perley et al. 
2009). This array is located at latitude 34° N in New Mexico and is able to track 
objects as far south as −30° for almost 7 h without going below 10° in elevation. 
Performance specifications called for imaging with full resolution down to at least 
−20° declination and for obtaining an image in no more than 8 h of observation 
without moving antennas to new locations. In designing the array, comparison of 
the performance of various antenna configurations was accomplished by computing 
the spatial transfer function with tracking over an hour-angle range ±4 h at various 
declinations. In judging the merit of any configuration, the basic concern was to 
minimize sidelobes in the synthesized beam. It was found that the percentage of 
holes in the (u, v) coverage was a consistent indication of the sidelobe levels of 
the synthesized beam, and to judge between different configurations, it was not 
always necessary to calculate the detailed response (National Radio Astronomy 
Observatory 1967, 1969). For a given number of antennas, the equiangular Y-shaped 
array was found to be superior to the cross and T-shaped array; see Fig. 5.17. 

Inverting the Y has no effect on the beam, but if the antennas have the same 
radial disposition on each arm, the performance near zero declination is improved 
by rotating the array so that the nominal north or south arm makes an angle of 
about 5° with the north-south direction. Without this rotation, the baselines between 
corresponding antennas on the other two arms are exactly east-west, and for $\delta$ = 0°, 
the spacing loci degenerate to straight lines that are coincident with the u axis and 
become highly redundant. The total number of antennas, 27, was chosen from a 
consideration of (u, v) coverage and sidelobe levels and resulted in peak sidelobes 
at least 16 dB below the main-beam response, except at $\delta$ = 0°, where Earth rotation 
is least effective. The 27 antennas provide 351 pair combinations. 

The positions of the antennas along the arms provide another set of variables 
that can be adjusted to optimize the spatial transfer function. Figure 5.17 shows 
two approaches to the problem. Configuration (a) was obtained by using a pseudo- 
dynamic computation technique (Mathur 1969), in which arbitrarily chosen initial 
conditions were adjusted by computer until a near-optimum (u, v) coverage was 

Fig. 5.17 (a) Proposed antenna configuration for the VLA that resulted from Mathur’s (1969) 
computer-optimized design. (b) Power-law design (Chow 1972) adopted for the VLA. © 1983 
IEEE. Reprinted, with permission, from P. J. Napier et al. (1983). 


reached. Configuration (b) shows a power-law configuration derived by Chow 
(1972). This analysis led to the conclusion that a spacing in which the distance of 
the nth antenna on an arm is proportional to $n^\alpha$ would provide good (u, v) coverage. 
Comparison of the empirically optimized configuration with the power-law spacing 
with $\alpha \approx 1.7$ showed the two to be essentially equal in performance. The power-law 
result was chosen largely for reasons of economy. A requirement of the design 
was that four sets of antenna stations be provided to vary the scale of the spacings in 
four steps, to allow a choice of resolution and field of view for different astronomical 
objects. By making $\alpha$ equal to the logarithm to the base 2 of the scale factor between 
configurations, the location of the nth station for one configuration coincides with 
that of the 2nth station for the next-smaller configuration. The total number of 
antenna stations required was thereby reduced from 108 to 72. Plots of the spatial 
frequency coverage are shown in Fig. 5.18. The snapshot in Fig. 5.18d shows the
instantaneous coverage, which is satisfactory for imaging simple structure in strong 
sources. 
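
The station-sharing argument above can be checked with a short sketch. It assumes nine antennas on each of three arms (27 in total) and the exponent α = 1.7 quoted in the text; the names `station_radius`, `N_PER_ARM`, and so on are illustrative. Because the radius of station n in the kth-smaller configuration is (n / 2^k)^α, two stations coincide exactly when n / 2^k is the same, so the count is done with exact fractions.

```python
from fractions import Fraction

ALPHA = 1.7      # power-law exponent quoted in the text
N_PER_ARM = 9    # assumption: 27 antennas shared equally among 3 arms
N_ARMS = 3
N_CONFIGS = 4


def station_radius(n: int, k: int, alpha: float = ALPHA) -> float:
    """Relative distance of station n from the centre in configuration k.

    k = 0 is the largest configuration; each step to k + 1 shrinks the
    scale by 2**alpha, so the radius is (n / 2**k)**alpha.
    """
    return (n / 2**k) ** alpha


# Distinct stations on one arm correspond to distinct values of n / 2**k.
unique_per_arm = {
    Fraction(n, 2**k)
    for k in range(N_CONFIGS)
    for n in range(1, N_PER_ARM + 1)
}

stations_without_sharing = N_ARMS * N_PER_ARM * N_CONFIGS
stations_with_sharing = N_ARMS * len(unique_per_arm)
scale_factor = 2**ALPHA

print(stations_without_sharing, stations_with_sharing)  # 108 72
print(round(scale_factor, 2))                            # 3.25
```

Station 3 of one configuration and station 6 of the next-smaller one give the same value of `station_radius`, which is the coincidence described in the text.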


Fig. 5.18 Spatial frequency coverage for the VLA with the power-law configuration of Fig. 5.17b: 
(a) δ = 45°; (b) δ = 30°; (c) δ = 0°; (d) snapshot at zenith. The range of hour angle is ±4 h or
as limited by a minimum pointing elevation of 9°, and ±5 min for the snapshot. The lengths of the
(u, v) axes from the origin represent the maximum distance of an antenna from the array center, 
that is, 21 km for the largest configuration. © 1983 IEEE. Reprinted, with permission, from P. J. 
Napier et al. (1983). 
5.6.2 Closed Configurations


The discussion here largely follows that of Keto (1997). Returning to the proposed 
criterion of uniform distribution of measurements within a circle in the (u, v) plane, 
we note that a configuration of antennas around a circle (a ring array) provides a 
useful starting point since the distribution of antenna spacings cuts off sharply in 
all directions at the circle diameter. This is shown in Fig. 5.7g and h. We begin
by considering the instantaneous (u, v) coverage for a source at the zenith. This is 
shown in Fig. 5.19a for 21 equally spaced antenna locations indicated by triangles. 
There are 21 antenna pairs at the unit spacing, uniformly distributed in azimuth, and 
each of these is represented by two points in the (u, v) plane. The same statement 
can be made for any other paired spacings around the circle. As a result, the spatial 
transfer function consists of points that lie on a pattern of circles and radial lines. 
Note also that as the spacings approach the full diameter of the circle, the distance 
between antennas increases only very slowly. For example, the direct distance 
between antennas spaced 10 intervals around the circle is very little more than that 
for antennas at 9 intervals. Thus, there is an increase in the density of measurements 
at the longest spacings (the points along any radial line become more closely spaced) 
as well as a marked increase toward the center. Note that the density of points closely 
follows the radial profile of the autocorrelation function in Fig. 5.7j, except close to 
the origin, since Fig. 5.19 includes only cross-correlations between antennas. 
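
The slow growth of the baseline length near the full diameter can be illustrated with the chord formula for a uniform ring, d_k = D sin(πk/N), where k is the number of intervals between the two antennas. This is a minimal sketch; `ring_baseline` is an illustrative name and the diameter is normalized to 1.

```python
import math


def ring_baseline(k: int, n_antennas: int = 21, diameter: float = 1.0) -> float:
    """Chord length between antennas k intervals apart on a uniform ring."""
    return diameter * math.sin(math.pi * k / n_antennas)


# For 21 antennas the distinct separations are k = 1 ... 10, each of which
# occurs for 21 antenna pairs (21 * 10 = 210 pairs in total).
for k in range(1, 11):
    print(k, round(ring_baseline(k), 4))
```

For 21 antennas, the step from 9 to 10 intervals adds only about 2% of the diameter, compared with roughly 15% for the step from 1 to 2 intervals, which is why the points crowd together along each radial line at the longest spacings.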

One way of obtaining a more uniform distribution is to randomize the spacings 
of the antennas around the circle. The (u, v) points are then no longer constrained 
to lie on the pattern of circles and lines, and Fig. 5.19b shows an example in
which a partial optimization has been obtained by computation using a neural-net 
algorithm. Keto (1997) discussed various algorithms for optimizing the uniformity 
of the spatial sensitivity. An earlier investigation of circular arrays by Cornwell 
(1988) also resulted in good uniformity within a circular (u, v) area. In this case, 
an optimizing program based on simulated annealing was used, and the spacing of 
the antennas around the circle shows various degrees of symmetry that result in 
patterns resembling crystalline structure in the (u, v) spacings. 

Optimizing the antenna configurations can also be considered more broadly, and 
Keto (1997) noted that the cutoff in spacings at the same value for all directions 
is not unique to the circular configuration. There are other figures, such as the 
Reuleaux triangle, for which the width is constant in all directions. The Reuleaux 
triangle is shown in Fig. 5.7i and consists of three equal circular arcs indicated by 
the solid lines. The total perimeter is equal to that of a circle with diameter equal 
to one of the sides of the equilateral triangle shown by the broken lines. Similar 
figures can be constructed for any regular polygon with an odd number of sides, 
and a circle represents such a figure for which the number tends to infinity. The 
Reuleaux triangle is the least symmetrical of this family of figures. Other facts about 
the Reuleaux triangle and similar figures can be found in Rademacher and Toeplitz 
(1957). 
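
The two properties of the Reuleaux triangle used above (constant width and a perimeter equal to that of a circle whose diameter is the triangle side) can be verified numerically. The sketch below samples the three 60° arcs for a unit side; the function names and the sampling density are illustrative choices.

```python
import math


def reuleaux_points(side: float = 1.0, n_per_arc: int = 2000):
    """Sample points on a Reuleaux triangle built on an equilateral triangle.

    Each arc is centred on one vertex, has radius equal to the side, and
    joins the other two vertices (a 60-degree arc).
    """
    verts = [(0.0, 0.0), (side, 0.0), (side / 2, side * math.sqrt(3) / 2)]
    pts = []
    for i in range(3):
        cx, cy = verts[i]
        ax, ay = verts[(i + 1) % 3]          # arc starts at the next vertex
        start = math.atan2(ay - cy, ax - cx)
        for j in range(n_per_arc):
            t = start + (math.pi / 3) * j / n_per_arc
            pts.append((cx + side * math.cos(t), cy + side * math.sin(t)))
    return pts


def width(points, phi: float) -> float:
    """Extent of the figure projected onto the direction at angle phi."""
    proj = [x * math.cos(phi) + y * math.sin(phi) for x, y in points]
    return max(proj) - min(proj)


pts = reuleaux_points()
perimeter = sum(math.dist(pts[i], pts[(i + 1) % len(pts)])
                for i in range(len(pts)))
widths = [width(pts, math.radians(d)) for d in range(0, 180, 5)]

print(round(perimeter, 4), round(math.pi, 4))
print(round(min(widths), 4), round(max(widths), 4))
```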
Fig. 5.19 (a) A circular array with 21 uniformly spaced antennas indicated by the triangles, and
the instantaneous spatial frequency coverage indicated by the points. The scale of the diagrams 
is the same for both the antenna positions and the spatial frequency coordinates u and v. (b) The 
array and spatial frequency coverage as in (a) but after adjustment of the antenna positions around 
the circle to improve the uniformity of the coverage. (c) An array of 24 antennas equally spaced 
around a Reuleaux triangle, and the corresponding spatial frequency coverage. (d) The array and 
spatial sensitivity as in (c) with adjustment of the antenna spacing to optimize the uniformity of 
the coverage. From Keto (1997). © AAS. Reproduced with permission. 


Since the optimization of the circular array in Fig. 5.19b results in a reduction
in the symmetry, it may be expected that an array based on the Reuleaux triangle
would provide better uniformity in the spatial frequency coverage than the circular
array. This is indeed the case, as can be seen by comparing Figs. 5.19a and c, where
the spacing between adjacent antennas for both is uniform. The circular array with
irregular antenna spacings in Fig. 5.19b was obtained by starting with a circular
array and allowing antenna positions to be moved small distances. In this case,
the program was not allowed to reach a fully optimized solution. Allowing the
optimization to run to convergence results in antennas at irregular spacings around
a Reuleaux triangle, as shown in Fig. 5.19d. This result does not depend on the
starting configuration. Comparison of Figs. 5.19b and d shows that the difference
between the circle and the Reuleaux triangle is much less marked when they have
both been subjected to some randomization of the antenna positions around the
figure, although a careful comparison shows the uniformity in Fig. 5.19d to be a
little better than in b. 

Figure 5.20 shows the spatial frequency coverage for an array in an optimized 
Reuleaux triangle configuration. The tracking range is ~ ±3 h of hour angle, and the
latitude is equal to that of the VLA. Comparison of these figures with corresponding
ones for the VLA in Fig. 5.18 shows that the Reuleaux triangle produces spatial
frequency coverage that is closer to the uniformly sampled circular area than
does the equiangular Y configuration. As indicated in Fig. 5.7, the autocorrelation
function of a figure with linear arms contains high values in directions where the 
Fig. 5.20 Spatial frequency coverage for a closed configuration of 24 antennas optimized for
uniformity of measurements in the snapshot mode: (a) snapshot at zenith; (b) δ = +30°; (c)
δ = 0°; (d) δ = −28°. The triangles in (a) indicate the positions of the antennas. The tracking is
calculated for an array at 34° latitude to simplify comparison with the VLA (Fig. 5.18). For each 
declination shown, the tracking range is the range of hour angle for which the source elevation is 
greater than 25°. From Keto (1997). © AAS. Reproduced with permission. 
arms of overlapping figures line up. This effect contributes to the lack of uniformity
in the spatial sensitivity of the Y-shaped array. Curvature of the arms or quasi- 
random lateral deviations of the antennas from the arms helps to smear the sharp 
structure in the spatial transfer function. The high values along radial lines do not 
occur in the autocorrelation function of a circle or similar closed figure, which is 
one reason why configurations of this type provide more uniform spatial frequency 
coverage. 

Despite some less-than-ideal features of the equiangular Y-shaped array, the 
VLA produces astronomical images of very high quality. Thus, although the 
circularity and uniformity of the spatial frequency coverage are useful criteria, 
they are not highly critical factors. As long as the measurements cover the range 
of u and v for which the visibility is high enough to be measurable, and the 
source is strong enough that any loss in sensitivity resulting from nonuniform 
weighting can be tolerated, excellent results can be obtained. The Y-shaped array 
has a number of practical advantages over a closed configuration. When several 
scaled configurations are required to allow for a range of angular resolution, the 
alternative locations lie along the same arms, whereas with the circle or Reuleaux 
triangle, separate scaled configurations are required. The flexibility of the Y-shaped 
array is particularly useful in VLA observations at southern declinations for which 
the projected spacings are seriously foreshortened in the north-south direction. 
For such cases, it is possible to move the antennas on the north arm onto the 
positions for the next-larger configuration and thereby substantially compensate for 
the foreshortening. 

Some further interesting examples of configurations are given below. 


• The compact array of the Australia Telescope is an east-west linear array of six
antennas, all movable on a rail track (Frater et al. 1992). 

• The UTR-2 is a T-shaped array of large-diameter, broadband dipoles built by the
Ukrainian Academy of Sciences near Grakovo, Ukraine (Braude et al. 1978). The
frequency range of operation is 10–25 MHz. Several smaller antennas of similar
type have been constructed at distances up to approximately 900 km from the 
Grakovo site and are used for VLBI observations. 

• An array of 720 conical spiral antennas in a T-shaped configuration operating in
the frequency range 15–125 MHz was constructed at Borrego Springs, California
(Erickson et al. 1982). 

• The Mauritius Radio Telescope, near Bras d’eau, Mauritius, is a T-shaped array
of helix antennas operating at 150 MHz. The east-west arm is 2 km long. The
south arm is 880 m long and is synthesized by moving a group of antennas on 
trolleys. The array is similar in principle to the one in Fig. 1.12a. It is intended to 
cover a large portion of the Southern Hemisphere. 

• The GMRT (Giant Metrewave Radio Telescope) near Pune, India, consists of 30
antennas, 16 of which are in a Y-shaped array with curved arms approximately 
15 km long. The remaining 14 are in a quasi-random cluster in the central 2 km 
(Swarup et al. 1991). The antennas are 45 m in diameter and are at fixed locations. 
The highest operating frequency is approximately 1.6 GHz. 
• A circular array was constructed at Culgoora, Australia, for observations of the
Sun (Wild 1967). This was a multibeam, scanning, phased array rather than a
correlator array, consisting of 96 antennas uniformly spaced around a circle of
diameter 3 km and operating at 80 and 160 MHz. To suppress unwanted sidelobes
of the beam, Wild (1965) devised an ingenious phase-switching scheme called
J² synthesis. The spatial sensitivity of this ring array was analyzed by Swenson
and Mathur (1967).

• The Multielement Radio-Linked Interferometer Network (MERLIN) of the
Jodrell Bank Observatory, England, consists of six antennas with baselines up 
to 233 km (Thomasson 1986). 

• The Submillimeter Array (SMA) of the Smithsonian Astrophysical Observatory
and Academia Sinica of Taiwan, located on Mauna Kea, Hawaii, is the first array 
to be built using a Reuleaux triangle configuration (Ho et al. 2004). 

• In large arrays in which the antennas cover areas extending over several
kilometers, there is usually a central area with relatively dense antenna coverage, 
surrounded by extensive areas with sparser coverage. These outer parts may be 
in the form of extended arms, but the placement of the individual antennas is 
often irregular as a result of details of the landscape. Examples include ALMA 
(Wootten and Thompson 2009), the Murchison Widefield Array (Lonsdale et al. 
2009), the Australian SKA Pathfinder (DeBoer et al. 2009), and the Low- 
Frequency Array (LOFAR) (de Vos et al. 2009). For discussion of projects for 
large arrays, see Carilli and Rawlings (2004). 


5.6.3 VLBI Configurations 


In VLBI (very-long-baseline interferometry) arrays, which are discussed in more 
detail in Chap. 9, the layout of antennas results from considerations of both (u, v) 
coverage and practical operating requirements. During the early years of VLBI, 
the signals were recorded on magnetic tapes that were then sent to the correlator 
location for playback. The use of tape has been superseded by magnetic disks and in 
some cases by direct transmission of the signals to the correlator using fiberoptic or 
other transmission media. Observing periods are limited by the ranges of hour angle 
and declination that are simultaneously observable from widely spaced locations. 
Although these locations usually deviate significantly from a plane, the angular 
widths of the sources under observation are generally sufficiently small that the 
small-field approximation (i.e., l and m small) can be used in deriving the radio
image, as in Eq. (3.9). 

For the first two decades after the inception of the VLBI technique, observations 
were mainly joint ventures among different observatories. Consideration of arrays 
dedicated solely to VLBI occurred as early as 1975 (Swenson and Kellermann 
1975), but construction of such arrays did not begin for another decade. A study 
of antenna locations for a VLBI array has been discussed by Seielstad et al. (1979). 
To obtain a single index as a measure of the performance of any configuration, the
spatial transfer function was computed for a number of declinations. The fraction 
of appropriately sized (u, v) cells containing measurements was then weighted in 
proportion to the area of sky at each declination and averaged. Maximizing the 
index, in effect, minimizes the number of holes (unfilled cells). Other studies have 
involved computing the response to a model source, synthesizing an image, and 
improving the model as necessary. 
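
The single-index measure described above can be sketched as follows. This is not the procedure of Seielstad et al. (1979) in detail: the cell size, the circular boundary, and the use of cos δ as the sky-area weight are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def filled_fraction(uv_points, uv_max: float, n_cells: int) -> float:
    """Fraction of square cells inside the circle |(u, v)| < uv_max with data.

    Each measurement also fills the cell of its conjugate point (-u, -v).
    """
    cell = 2.0 * uv_max / n_cells

    def index(x: float) -> int:
        return int((x + uv_max) // cell)

    # Cells are counted if their centres fall inside the circle.
    inside = {
        (i, j)
        for i in range(n_cells)
        for j in range(n_cells)
        if math.hypot((i + 0.5) * cell - uv_max,
                      (j + 0.5) * cell - uv_max) < uv_max
    }
    filled = set()
    for u, v in uv_points:
        for uu, vv in ((u, v), (-u, -v)):
            if abs(uu) < uv_max and abs(vv) < uv_max:
                filled.add((index(uu), index(vv)))
    return len(filled & inside) / len(inside)


def coverage_index(fraction_by_dec: dict) -> float:
    """Average of the filled fractions, weighted by cos(declination)."""
    weights = {d: math.cos(math.radians(d)) for d in fraction_by_dec}
    total = sum(weights.values())
    return sum(fraction_by_dec[d] * weights[d] for d in fraction_by_dec) / total


# Toy example: one sample (plus its conjugate) on a 2 x 2 grid of cells.
print(filled_fraction([(0.5, 0.5)], uv_max=1.0, n_cells=2))  # 0.5
```

Maximizing `coverage_index` over candidate antenna locations then favors configurations with few unfilled cells at the declinations that cover the most sky.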

The design of an array dedicated to VLBI, the Very Long Baseline Array (VLBA) 
of the United States, is described by Napier et al. (1994). The antenna locations [and 
associated (u, v) loci] are shown in Fig. 5.21 and listed in Table 5.1. A discussion 
of the choice of sites is given by Walker (1984). Antennas in Hawaii and St. Croix 
provide long east-west baselines. New Hampshire to St. Croix is the longest
north-south spacing. A site in Alaska would be farther north but would be of limited
benefit because it would provide only restricted accessibility for sources at southern 
declinations. An additional site within the Southern Hemisphere would enhance 
the (u, v) coverage at southern declinations. The southeastern region of the United 
States is avoided because of the higher levels of water vapor in the atmosphere. 
Intermediate north-south baselines are provided by the drier West Coast area. The 
Iowa site fills in a gap between New Hampshire and the southwestern sites. The 
short spacings are centered on the VLA, and as a result, the spatial frequency 
coverage shows a degree of central concentration. This enables the array to make 
measurements on a wider range of source sizes than would be possible with the 
same number of antennas and more uniform coverage. However, this results in some 
sacrifice in capability for imaging complex sources. 
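
As a rough illustration of the baseline lengths involved, the sketch below converts the Table 5.1 coordinates of three sites to straight-line (chord) distances. It assumes a spherical Earth of mean radius 6371 km and ignores site elevations, so the results are only approximate; the helper names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def dms(deg: float, minutes: float, seconds: float) -> float:
    """Convert degrees, minutes, seconds to decimal degrees."""
    return deg + minutes / 60.0 + seconds / 3600.0


def chord_km(site1, site2, radius: float = EARTH_RADIUS_KM) -> float:
    """Straight-line distance between two (N. lat, W. long) sites in degrees."""
    p1, p2 = math.radians(site1[0]), math.radians(site2[0])
    dlon = math.radians(site1[1] - site2[1])
    cos_c = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    c = math.acos(max(-1.0, min(1.0, cos_c)))  # central angle
    return 2.0 * radius * math.sin(c / 2.0)


# Coordinates from Table 5.1 (N. latitude, W. longitude).
mauna_kea = (dms(19, 48, 15.85), dms(155, 27, 28.95))
st_croix = (dms(17, 45, 30.57), dms(64, 35, 2.61))
hancock = (dms(42, 56, 0.96), dms(71, 59, 11.69))

east_west_km = chord_km(mauna_kea, st_croix)
north_south_km = chord_km(hancock, st_croix)
print(round(east_west_km), round(north_south_km))
```

Under these assumptions the Hawaii to St. Croix baseline comes out near 8600 km, about three times the New Hampshire to St. Croix spacing.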


5.6.4 Orbiting VLBI Antennas 


The discussion of placing a VLBI station in Earth orbit to work with ground-based 
arrays started as early as 1969 (Preston et al. 1983; Burke 1984; Kardashev et al. 
2013). The combination of orbiting VLBI (OVLBI) and ground-based antennas has 
several obvious advantages. Higher angular resolution can be achieved, and the 
ultimate limit may be set by interstellar scintillation (see Sect. 14.4). The orbital 
motion of the spacecraft helps to fill in the coverage in the (u, v) plane and has 
the potential to improve the detail and dynamic range in the resulting images. 
Furthermore, a satellite in low Earth orbit provides rapid (u, v) plane variation, 
which can be valuable for obtaining information on time variability of source 
structure. 

Figure 5.22 shows an example of the (u, v) coverage for observations with the 
VSOP project spacecraft known as HALCA (Hirabayashi et al. 1998) and a series 
of terrestrial antennas: one at Usuda, Japan, one at the VLA site, and the ten VLBA 
antennas. The spacecraft orbit is inclined at an angle of 31° to the Earth’s equator, 
and the height above the Earth’s surface is 21,400 km at apogee and 560 km at 
perigee. The mission of this spacecraft was to extend the resolution by a factor of 
Fig. 5.21 Very Long Baseline Array in the United States: (a) locations of the ten antennas, and
(b) spatial frequency coverage (spacings in thousands of kilometers) for declinations of 64°, 30°,
6°, and −18°, in which the observing time at each antenna is determined by an elevation limit of
10°. From Walker (1984). Reprinted with the permission of and © Cambridge University Press. 
Table 5.1 Locations of antennas in the VLBA*

Location            N. Latitude      W. Longitude     Elevation
                    (deg min sec)    (deg min sec)    (m)

St. Croix, VI       17 45 30.57       64 35 02.61        16
Hancock, NH         42 56 00.96       71 59 11.69       309
N. Liberty, IA      41 46 17.03       91 34 26.35       241
Fort Davis, TX      30 38 05.63      103 56 39.13      1615
Los Alamos, NM      35 46 30.33      106 14 42.01      1967
Pie Town, NM        34 18 03.61      108 07 07.24      2371
Kitt Peak, AZ       31 57 22.39      111 36 42.26      1916
Owens Valley, CA    37 13 54.19      118 16 33.98      1207
Brewster, WA        48 07 52.80      119 40 55.34       255
Mauna Kea, HI       19 48 15.85      155 27 28.95      3720

*© 1994 IEEE. Reprinted, with permission, from P. J. Napier et al. (1994).
Fig. 5.22 (u, v) plane tracks for arrays with a satellite station for the source 1622+633 at 5 GHz.
(left) Coverage with VSOP and 12 ground-based antennas. The roughly circular tracks within 2 ×
10⁸ are the baselines among the ground-based antennas. Produced with the FAKESAT software
developed by D. W. Murphy, D. L. Meier, and T. J. Pearson. (right) Coverage with RadioAstron 
and six ground-based antennas. The gaps in the coverage correspond to actual satellite constraints 
for hypothetical observations in February 2016. The satellite period is 8.3 days, and the “wobbly” 
appearance of the tracks is caused by the Earth’s diurnal motion. Produced with the FAKERAT 
software, a derivative of FAKESAT (http://www.asc.rssi.ru/radioastron/software/fakerat). 


three over ground-based arrays and to retain good imaging capability. The spacings 
shown are for a frequency of 5 GHz, and the units of u and v are 10⁶ wavelengths;
the maximum spacing is 5 × 10⁸ wavelengths, which corresponds to a fringe width
of 0.4 mas. The approximately circular loci at the center of the figure represent 
baselines between terrestrial antennas. The orbital period is 6.3 h, and the data 
shown correspond to an observation of duration about four orbital periods. The 
spacecraft orbit precesses at a rate of order 1° per day, and over the course of one
to two years, the coverage of any particular source can be improved by combining 
observations. 
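
The fringe width quoted for the HALCA observations is the reciprocal of the maximum spacing expressed in wavelengths. A minimal sketch of that conversion, together with the corresponding physical baseline at 5 GHz (function names are illustrative):

```python
import math

C = 299_792_458.0                                   # speed of light, m/s
MAS_PER_RAD = math.degrees(1.0) * 3600.0 * 1000.0   # milliarcseconds per radian


def fringe_width_mas(spacing_wavelengths: float) -> float:
    """Fringe width (1/u, in milliarcseconds) for a spacing in wavelengths."""
    return MAS_PER_RAD / spacing_wavelengths


def baseline_km(spacing_wavelengths: float, freq_hz: float) -> float:
    """Physical baseline length for a spacing in wavelengths."""
    return spacing_wavelengths * (C / freq_hz) / 1000.0


print(round(fringe_width_mas(5e8), 2))   # 0.41
print(round(baseline_km(5e8, 5e9)))
```

A spacing of 5 × 10⁸ wavelengths at 5 GHz corresponds to a baseline of roughly 30,000 km, which is reachable only with an orbiting element.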

Figure 5.22 also shows examples of the (u, v) coverage for observations with the 
RadioAstron project spacecraft known as Spektr-R (Kardashev et al. 2013) and a set 
of ground-based antennas. The spacecraft orbit is inclined at an angle of 80° to the 
Earth’s equator, and, for the case shown here, the ellipticity is 0.86, and the height 
above the Earth’s surface is 289,000 km at apogee and 47,000 km at perigee (orbit 
on April 14, 2012). The mission of RadioAstron is to provide ultrahigh resolution 
to explore new astrophysical phenomena while sacrificing imaging quality because 
of the gap between satellite–Earth and Earth-only baselines. The orbital period is
8.3 days. The orbit evolves substantially with time because of the influences of the 
Sun and Moon. Occasions when the orbit eccentricity reaches its maximum of 0.95 
offer opportunities for better imaging capability. 
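
The orbital periods quoted for HALCA and RadioAstron can be cross-checked against the apogee and perigee heights using Kepler's third law, T = 2π (a³/GM)^(1/2). The sketch assumes an equatorial Earth radius of 6378 km to convert heights to geocentric distances and neglects perturbations; the constant and function names are illustrative.

```python
import math

GM_EARTH = 3.986004418e14   # gravitational parameter of the Earth, m^3 s^-2
R_EARTH_KM = 6378.0         # assumption: equatorial radius used for heights


def orbital_period_s(apogee_height_km: float, perigee_height_km: float) -> float:
    """Keplerian period from apogee and perigee heights above the surface."""
    semi_major_m = (apogee_height_km + perigee_height_km
                    + 2.0 * R_EARTH_KM) / 2.0 * 1.0e3
    return 2.0 * math.pi * math.sqrt(semi_major_m**3 / GM_EARTH)


halca_hours = orbital_period_s(21_400.0, 560.0) / 3600.0
radioastron_days = orbital_period_s(289_000.0, 47_000.0) / 86_400.0

print(round(halca_hours, 1), round(radioastron_days, 1))
```

Both results agree with the quoted periods (6.3 h and 8.3 days) to within about 1 to 2%, which is as close as the rounded heights allow.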

Figure 5.23 shows an example of the (u,v) coverage that could be obtained 
between two spacecraft in circular orbits of radius about ten Earth radii, with 
orthogonal planes that have periods differing by 10%. Multispacecraft operation 
offers satellite-to-satellite baselines, which are free from the effects of atmospheric 
delay. In practice, there are likely to be restrictions on coverage resulting from the 
limited steerability of the astronomy and communication antennas relative to the 
spacecraft. It is necessary for the spacecraft to maintain an attitude in which the solar 
power panels remain illuminated and the communications antenna can be pointed 
toward the Earth. Further discussion of orbiting VLBI is given in Sect. 9.10. 
Fig. 5.23 Spatial frequency coverage for two antennas on satellites with circular orbits of radius
approximately ten times the Earth’s radius R⊕: (a) source along the X axis; (b) source along Y or
Z axes; (c) source centered between X, Y, and Z axes. The orbits lie in the XY and XZ planes of 
a rectangular coordinate system. The satellite periods differ by 10%, and the observing period is 
approximately 20 days. From R. A. Preston et al. (1983), © Cépadués Editions, 1983. 
5.6.5 Planar Arrays


Studies of cosmic background radiation and the Sunyaev–Zel’dovich effect require
observations with very high brightness sensitivity at wavelengths of order 1 cm and
shorter; see also Sect. 10.7. Unlike the sensitivity to point sources, the sensitivity to
a broad feature that largely fills the antenna beam does not increase with increasing 
collecting area of the antenna. Thus, for cosmic background measurements, large 
antennas are not required. Extremely good stability is necessary to allow significant 
measurements at the level of a few tens of microkelvins per beam, that is, of order 
10 μJy arcmin⁻². Special arrays have been designed for this purpose. A number of
antennas are mounted on a platform, with their apertures in a common plane. The 
whole structure is then supported on an altazimuth mount so the antennas can be 
pointed to track any position on the sky. An example of such an instrument, the 
Cosmic Background Imager (CBI), was developed by Readhead and colleagues at 
Caltech (Padin et al. 2001). Thirteen Cassegrain focus paraboloids, each of diameter 
90 cm, were operated in the 26- to 36-GHz range. In this instrument, the antenna 
mounting frame had the shape of an irregular hexagon with threefold symmetry 
and maximum dimensions of approximately 6.5 m, as shown in Fig. 5.24. For the
particular type of measurements required, the planar array has a number of desirable 
properties compared with a single antenna of similar aperture, or a number of 
individually mounted antennas, as outlined below: 


• The use of a number of individual antennas allows the output to be measured
in the form of cross-correlations between antenna pairs. Thus, the output is not 
sensitive to the total power of the receiver noise but only to correlated signals 
entering the antennas. The effects of gain variations are much less severe than 
Fig. 5.24 (a) Face view of the antenna platform of the Cosmic Background Imager, showing a
configuration of the 13 antennas. (b) The corresponding antenna spacings in (u, v) coordinates for 
a wavelength of approximately 1 cm. 
in the case of a total-power receiver. Thermal noise from ground pickup in the
sidelobes is substantially resolved. 

• The antennas can be mounted with the closest spacing physically possible. There
are then no serious gaps in the spatial frequencies measured, and structure can be
imaged over the width of the primary antenna beams. Because the whole platform
tracks, the apertures cannot block one another, as can occur for individually
mounted antennas in closely spaced arrays.

• In the array in Fig. 5.24, the whole antenna mounting platform can be rotated
about an axis normal to the plane of the apertures. Thus, rotation of the baselines 
can be controlled as desired and is independent of Earth rotation. For a constant 
pointing direction and rotation angle relative to the sky, the pattern of (u, v) 
coverage remains constant as the instrument tracks. Variations in the correlator 
outputs with time can result from ground radiation in the sidelobes, which varies 
with azimuth and elevation as the array tracks. This variation can help to separate 
out the unwanted response. 

• The close spacing of the antennas results in some cross coupling by which
spurious correlated noise is introduced into the receiving channels of adjacent 
antennas. However, because the antennas are rigidly mounted, the coupling 
does not vary as the system tracks a point on the sky, as is the case for 
individually mounted antennas. The effects of the coupling are therefore more 
easily calibrated out. In the CBI design, the coupling is reduced to −110 to
−120 dB by the use of a cylindrical shield around each antenna and by designing
the subreflector supports to minimize scattering. 
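
The brightness sensitivity quoted at the start of this section (a few tens of microkelvins, of order 10 μJy arcmin⁻²) can be related through the Rayleigh-Jeans formula S = 2kTΩ/λ². The sketch below assumes the Rayleigh-Jeans approximation holds and uses 30 GHz as a representative frequency; the function name is illustrative.

```python
import math

K_B = 1.380649e-23                          # Boltzmann constant, J/K
C = 299_792_458.0                           # speed of light, m/s
ARCMIN2_SR = math.radians(1.0 / 60.0) ** 2  # one square arcminute in steradians


def rj_flux_density_ujy(delta_t_k: float, freq_hz: float,
                        solid_angle_sr: float = ARCMIN2_SR) -> float:
    """Rayleigh-Jeans flux density in microjanskys for a brightness temperature."""
    wavelength = C / freq_hz
    s_si = 2.0 * K_B * delta_t_k / wavelength**2 * solid_angle_sr  # W m^-2 Hz^-1
    return s_si / 1.0e-26 * 1.0e6  # 1 Jy = 1e-26 W m^-2 Hz^-1


# Flux density per square arcminute for 10 microkelvin at 30 GHz.
print(round(rj_flux_density_ujy(10e-6, 30e9), 1))
```

At 30 GHz a brightness temperature of 10 μK gives a little over 20 μJy arcmin⁻², consistent with the order of magnitude stated in the text.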


At a frequency of 30 GHz, a pointing error of 1” in a 6-m baseline produces 
a visibility phase error of 1°. Pointing accuracy is critical, and the CBI antenna is 
mounted in a retractable dome to shield it from wind, which can be strong at the 
5000-m-elevation site at Llano de Chajnantor, Chile. Observations of the cosmic 
microwave background with this system are briefly described in Sect. 10.7. 
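
The phase error quoted above follows from the path-length change D Δθ for a baseline of length D tilted by a small angle Δθ, giving a phase error of 2π(D/λ)Δθ. This is a small-angle sketch for the worst-case orientation of the error; the function name is illustrative.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def phase_error_deg(baseline_m: float, freq_hz: float,
                    pointing_error_arcsec: float) -> float:
    """Visibility phase error (degrees) from a small platform pointing error."""
    wavelength = C / freq_hz
    dtheta = math.radians(pointing_error_arcsec / 3600.0)
    return math.degrees(2.0 * math.pi * baseline_m / wavelength * dtheta)


# 6-m baseline, 30 GHz, 1 arcsecond pointing error.
print(round(phase_error_deg(6.0, 30e9, 1.0), 2))
```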


5.6.6 Some Conclusions on Antenna Configurations 


The most accurate prediction of the performance of an array is obtained by 
computation of the response of the particular design to models of sources to 
be observed. However, here we are more concerned with broad comparisons of 
various configurations to illustrate the general considerations in array design. Some 
conclusions are summarized below. 


• A circle centered on the (u, v) origin can be considered an optimum boundary
for the distribution of measurements of visibility. Uniformity of the distribution 
within the circle is a further useful criterion in many circumstances. An exception 
is the condition in which sidelobes of the synthesized beam are a serious problem, 
for example, in low-frequency arrays operating in conditions of source confusion. 
In arrays in which the scale of the configuration cannot be varied to accommodate 
a wide range of source dimensions, a centrally concentrated distribution allows a
greater range of angular sizes to be measured with a limited number of antennas. 
If sensitivity to broad, low-brightness objects is important, it is preferable to 
have more antenna pairs with short spacings at which such sources are not 
highly resolved. Note that two of the largest arrays in which the antennas are 
not movable, the GMRT (in India) and the VLBA (North America), each have a 
cluster of antennas at relatively short spacings, as well as other antennas at longer 
spacings, in order to cover a wide range of source dimensions. 

• Although the effect of sidelobes on the synthesized beam can be greatly reduced
by CLEAN and other image-processing algorithms described in Chap. 11, 
obtaining the highest dynamic range in radio images (that is, a range of 
reliable intensity measurements of order 10⁶ or more) requires both good spatial
frequency coverage and effective image processing. Reducing holes (unsampled
cells) in this coverage, which are found to be a consistent indicator of sidelobe
levels, is a primary objective in array design.

• The east-west linear array has been used for both large and small instruments
and requires tracking over ±6 h to obtain full two-dimensional coverage. It is
most useful for regions of the sky within about 60° of the celestial poles and is 
the most economical configuration with respect to land use for road or rail track. 

• The equiangular Y-shaped array gives the best spatial frequency coverage of the
existing configurations with linear, open-ended arms. Autocorrelation functions 
of configurations with odd numbers of arms have higher-order symmetry than 
those with even numbers in which opposite arms are aligned. Curvature of the 
arms or random displacement of the antennas helps to smooth out the linear 
ridges in the (u, v) coverage (e.g., in the snapshot in Fig. 5.18). Such features 
are also smoothed out by hour-angle tracking and are most serious for snapshot 
observations. 

• The circle and Reuleaux triangle provide the most uniform distributions of
measurements. With uniformly spaced antennas, the Reuleaux triangle provides 
more uniform (u, v) coverage than the circle, but varying the spacing in a quasi- 
random manner greatly improves both cases and reduces the difference between 
them; see Fig. 5.19. However, if higher resolution is needed, these configurations 
are not so easily extended as ones with open-ended arms. 


5.7 Implementation of Large Arrays 


Of the large arrays that have contributed prominently to progress in radio astronomy, 
those that developed first have largely been in the range of roughly 500 MHz to 
30 GHz, i.e., approximately the wavelength range of 1-60 cm. Examples are the 
VLA and the arrays at Westerbork (the Netherlands) and the Australia Telescope at 
Narrabri (Australia). This wavelength range is most conducive for construction of 
large parabolic reflectors with surface accuracy better than ~ 1/16 of a wavelength. 
Arrays for millimeter-wavelength observations such as the SMA on Mauna Kea 
followed a decade or two later, as technology for more accurate surfaces developed,
leading to ALMA on the Atacama plateau in Chile, which came into operation in 
2013 (Wootten and Thompson 2009). For the 12-m-diameter antennas of ALMA, 
the specified surface accuracy is better than 25 μm, allowing useful operation up to a
frequency of almost 1 THz. For details of measuring and adjusting the surface, see 
Mangum et al. (2006), Snel et al. (2007), and papers in Baars et al. (2009). The main 
ALMA array consists of fifty 12-m-diameter antennas movable between foundation
pads that allow a wide range of spacings up to ~ 15 km. A second, compact array
uses twelve 7-m-diameter antennas, and four other antennas are available for total power
measurements. 

At the long-wavelength end of the spectrum, radio astronomy was, for the 
first few decades, largely limited to measurements of relatively small numbers 
of the stronger sources (see, for example, Erickson et al. 1982). A major problem is
presented by the ionosphere, calibration of the effects of which requires that the 
antenna elements be arranged in phased clusters, or subarrays, the beams of which 
are no wider than the aplanatic structure of the ionosphere. The outputs of these 
clusters are cross-correlated to provide the visibility values. These long-wavelength 
observations are important for the study of the most distant Universe including 
redshifted neutral hydrogen just prior to the Epoch of Reionization. In LOFAR 
[de Vos et al. (2009) and van Haarlem et al. (2013)], the clusters of dipoles have 
diameters of ~ 81 m for 10-90 MHz and ~ 40 m for 115-240 MHz. LOFAR 
is based in the Netherlands, and baselines between the clusters extend up to 1200 
km in a generally eastward direction. The dipoles take the form of an inverted V 
configuration, in which four conductors run outward and downward at an angle of 
45° from a point roughly 2 m above the ground, forming two orthogonal dipoles over 
a ground plane. Note that since the need to calibrate the effect of the ionosphere 
places a lower limit on the size of the dipole clusters that are used, in this long- 
wavelength range, large-scale arrays are generally the most successful. 


5.7.1 Low-Frequency Range 


At frequencies up to about 300 MHz, arrays of broadband dipoles mounted over a 
ground-plane reflecting screen provide a very practical antenna system. Dipoles are 
robust, and crossed dipoles provide full polarization coverage. Low-noise transistor 
front ends can operate at ambient temperature at these frequencies, where the 
system noise level is set largely by radiation from the sky. Signals from groups 
of dipoles are combined and the phases adjusted to form beams that can be pointed 
as required without the need for moving parts. If the spacing between the centers of 
the dipoles is greater than $\lambda/2$, the array is described as sparse. The collecting area 
is maximized at $\lambda^2/4$ per element, but because of the spacing, the grating sidelobes 
begin to be significant as $\lambda/2$ is exceeded. If the spacing is less than $\lambda/2$, the array 
is described as compact. The effective area is then less than $\lambda^2/4$ per element, but 
grating lobes are avoided. The variation of the path length through the ionosphere is 
a serious problem in imaging at these low frequencies, but it is possible to calibrate 
the ionosphere over a wide angular range by forming beams in the directions of 
calibration sources for which the positions are accurately known. LOFAR and the 
Murchison Widefield Array (Lonsdale et al. 2009) and the Allen Telescope Array 
(Welch et al. 2009) are examples of this type. 

Ellingson (2005) describes a system using dipoles below 100 MHz. To achieve 
the maximum sensitivity, it is necessary only to match the antennas to the receivers 
sufficiently well that the total noise is dominated by the background component 
received by the antennas. This is an advantageous situation since it allows the 
dipoles to be used over a much wider frequency range than is possible when the 
impedance must be well matched. To investigate the performance of an inverted-V 
dipole under these conditions, let $\gamma$ be the power ratio of the background noise 
received from the sky to the noise contributed by the receiver. Then we have 

$$ \gamma = e_r \frac{T_{\rm sky}}{T_{\rm rec}} \left(1 - |\Gamma|^2\right) , $$

where $e_r$ (< 1) is an efficiency factor that results largely from the ohmic losses in 
the ground and in the dipole, $T_{\rm sky}$ is the noise brightness temperature of the sky, $T_{\rm rec}$ 
is the noise temperature of the receiver, and $\Gamma$ is the voltage reflection coefficient at 
the antenna looking toward the receiver. $\Gamma$ is given by 

$$ \Gamma = \frac{Z_{\rm rec} - Z_{\rm ant}}{Z_{\rm rec} + Z_{\rm ant}} , $$

where $Z_{\rm rec}$ and $Z_{\rm ant}$ are the impedances at the receiver and antenna terminals, 
respectively. For dominance of the sky noise, one can take $\gamma$ greater than ~ 10. 
$T_{\rm sky}$ is related to the intensity of the background radiation $I_\nu$ (W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$) 
by $T_{\rm sky} = c^2 I_\nu / 2k\nu^2$, where c is the speed of light and k is Boltzmann’s constant. 
An expression for the sky background intensity $I_\nu$ as a function of frequency is given 
by Dulk et al. (2001) based on measurements by Cane (1979): 

$$ I_\nu = I_g \nu^{-0.52} \frac{1 - e^{-\tau(\nu)}}{\tau(\nu)} + I_{eg} \nu^{-0.80} e^{-\tau(\nu)} , $$

where $I_g = 2.48 \times 10^{-20}$ W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$ is the galactic component of the 
intensity, $I_{eg} = 1.06 \times 10^{-20}$ W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$ is the extragalactic component, and 
$\tau(\nu) = 5.0\nu^{-2.1}$, with $\nu$ in MHz in this expression. This model applies broadly over the sky except near the galactic 
plane where higher intensities are encountered. In the system described by Ellingson 
(2005), a wide frequency response for the dipoles is obtained with $Z_{\rm rec}$ in the range 
200-800 ohms. Computed responses indicate usable beamwidths in the range 120- 
140°. Stewart et al. (2004) describe design of an inverted-V dipole in which the 
effective width of the conducting arms is increased in one dimension, which reduces 
the impedance variation with frequency compared with that of a dipole with single- 
wire elements. 
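
As a rough numerical illustration of the sky-noise-dominance condition, the following sketch evaluates the background model in the form commonly quoted from Dulk et al. (2001), $I_\nu = I_g \nu^{-0.52}(1 - e^{-\tau})/\tau + I_{eg}\nu^{-0.80}e^{-\tau}$ with $\nu$ in MHz, converts it to $T_{\rm sky}$, and forms the ratio $\gamma = e_r (T_{\rm sky}/T_{\rm rec})(1 - |\Gamma|^2)$. These two expressions are taken here as assumptions, and the function names and any impedance, efficiency, and receiver-temperature values supplied to them are hypothetical, not those of any particular instrument.

```python
import math

C = 299_792_458.0      # speed of light, m/s
K_B = 1.380649e-23     # Boltzmann constant, J/K
I_G = 2.48e-20         # galactic component, W m^-2 Hz^-1 sr^-1
I_EG = 1.06e-20        # extragalactic component, W m^-2 Hz^-1 sr^-1


def sky_intensity(freq_mhz):
    """Background intensity I_nu for a frequency given in MHz (assumed model form)."""
    tau = 5.0 * freq_mhz ** -2.1
    galactic = I_G * freq_mhz ** -0.52 * (1.0 - math.exp(-tau)) / tau
    extragalactic = I_EG * freq_mhz ** -0.80 * math.exp(-tau)
    return galactic + extragalactic


def sky_temperature(freq_mhz):
    """Brightness temperature T_sky = c^2 I_nu / (2 k nu^2), with nu in Hz."""
    nu_hz = freq_mhz * 1.0e6
    return C ** 2 * sky_intensity(freq_mhz) / (2.0 * K_B * nu_hz ** 2)


def reflection_coefficient(z_rec, z_ant):
    """Voltage reflection coefficient for (possibly complex) impedances."""
    return (z_rec - z_ant) / (z_rec + z_ant)


def sky_to_receiver_ratio(freq_mhz, z_rec, z_ant, t_rec, e_r):
    """Ratio gamma of sky noise to receiver noise (assumed expression)."""
    mismatch = 1.0 - abs(reflection_coefficient(z_rec, z_ant)) ** 2
    return e_r * sky_temperature(freq_mhz) / t_rec * mismatch
```

For example, `sky_to_receiver_ratio(50.0, 400.0, complex(100.0, -300.0), 250.0, 0.8)` gives the ratio for a deliberately mismatched dipole; comparing the result with ~ 10 indicates whether the sky noise dominates under these assumed values.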


5.7.2 Midfrequency and Higher Ranges 


In the midfrequency range, approximately 0.3–2 GHz, there are two main possibilities. 
For the frequencies up to about 1 GHz, aperture arrays (van Ardenne et al. 
2009) can take the form of half-wave dipoles over a ground screen or, especially 
at the shorter wavelengths, arrays of Vivaldi antennas (Schaubert and Chio 1999). 
The Vivaldi elements are formed on strips of aluminum or of copper-clad 
insulating board. By using two sets of Vivaldi elements running in orthogonal 
directions, full polarization is obtained. The approximate spacing between adjacent 
Vivaldi elements is $\lambda/2$, and approximately four amplifiers are required for each 
square wavelength of collecting area, e.g., ~ 44 amplifiers per square meter at 
1 GHz. Aperture arrays provide multiple beams with rapid and flexible pointing. 
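
The amplifier count quoted above follows directly from the wavelength. A minimal sketch of the arithmetic, assuming four amplifiers per square wavelength (the function name is hypothetical):

```python
C = 299_792_458.0  # speed of light, m/s


def amplifiers_per_square_metre(freq_hz, per_square_wavelength=4.0):
    """Amplifier density for an aperture array with ~lambda/2 element spacing."""
    wavelength = C / freq_hz
    return per_square_wavelength / wavelength ** 2
```

Because the density scales as the square of the frequency, doubling the upper frequency limit of an aperture array roughly quadruples the number of amplifiers needed for the same collecting area.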


5.7.2.1 Phased-Array Feeds 


For the range from ~ 700 MHz and above, parabolic dish-type antennas with single 
or multiple beams become more practicable than aperture arrays since, for a given 
collecting area, they do not require such large numbers of low-noise amplifiers and 
phasing components. With feeds in the form of a focal-plane array, i.e., an array of 
individual feed elements in the focal plane of an antenna, it is usually not possible 
to get the feeds close enough together to avoid gaps between the individual beams. 
Thus, it is often preferable to use phased-array feeds in which an array of closely 
spaced receiving elements is arranged in the focal plane. Any one antenna beam is 
formed as a phased combination of the signals from a number of the feed elements, 
and such combinations can be designed to provide optimum beam spacings for 
efficient sky coverage. It is the beamformer that distinguishes the phased-array feed 
from the focal-plane array. The elements are individually terminated with matched 
amplifiers, but mutual coupling between the elements cannot be avoided, so the 
design and adjustment of phased-array feeds is generally more critical than for focal- 
plane arrays. A general analysis of a phased-array feed can be found in van Ardenne 
et al. (2009) and Roshi and Fisher (2016). 

Designs of phased-array feeds include ones using the Vivaldi system mentioned 
above and others using a “checkerboard” conductor pattern (Hay et al. 2007). The 
checkerboard scheme can be envisaged as a series of conducting elements on a 
circuit board that are arranged like the black squares of a checkerboard. At each 
point where two corners of conducting squares meet, the corners do not touch, 
but each feeds one input of a balanced amplifier. The patterns of conducting and 
nonconducting surfaces are identical and thus self-complementary. A screen of this 
form in free space is well matched with load impedances of 377 ohms between 
the corner pairs of conducting squares where the amplifiers are connected.² For use 
as a feed array, the checkerboard screen is mounted over a ground plane, which 
introduces some frequency variation in the impedance. In this frequency range, the 
input stages of amplifiers at the feeds may be cryogenically cooled to minimize the 
system temperature. 

²This follows from a formula by Booker: see, e.g., Antennas, J. D. Kraus (1950 or later edition). 

The use of phased-array feeds in interferometric arrays presents a huge challenge 
in signal processing because separate correlators are required for each beam. The 
first interferometer to be designed specifically for phased-array feed technology is 
ASKAP at the Murchison Radio Observatory. The system has 36 dual-polarized 
beams operating in the 0.7–1.8 GHz band (Hay et al. 2007; Hotan et al. 2014). A 52-element 
phased array called APERTIF at 21-cm wavelength has been implemented 
on the Westerbork telescope (van Cappellen and Bakker 2010; van Cappellen et al. 
2011; and Ivashina et al. 2011). 


5.7.2.2 Optimum Antenna Size 


An array with fixed collecting area can be built with a large number of small 
antennas (called the “large N, small D solution”) or a small number of large antennas 
(the “small N, large D solution”). Determining the right antenna size is a complex 
problem. With smaller antennas, the field of view is larger, which enhances survey 
speed, but with larger antennas, phase calibration sources can be found closer to the 
target. 

A cost analysis is an important element in the determination of antenna size. The 
critical fact in cost optimization is that the cost of parabolic antenna elements of 
diameter D scales approximately as $D^{2.7}$ (Meinel 1979). Because the exponent on D 
is greater than two, the total cost of the antennas in an array increases with diameter 
for a fixed array area. On the other hand, a larger array of smaller antennas requires 
more receivers and a larger correlator. A crude cost model can be written 

$$ C = f_1 n_a D^{2.7} + f_2 n_a + f_3 n_a^2 , \tag{5.25} $$

where $n_a$ is the number of antennas, $f_1$ is the antenna cost factor, $f_2$ is the receiver 
cost factor, and $f_3$ is the correlator cost factor, where we assume the correlator cost 
scales as $n_a^2$. For a fixed array collecting area, A, 

$$ n_a = \frac{4A}{\pi \eta D^2} , \tag{5.26} $$

where $\eta$ is the aperture efficiency. We can substitute Eq. (5.26) into Eq. (5.25) and 
find the value of D that minimizes C. These values of D are typically in the range of 
4 to 20 m. The proposals for the antenna sizes for ALMA ranged in diameter from 
6 to 15 m before the decision was made for 12-m-diameter elements, based on cost 
and many other factors. 
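
The minimization can be sketched numerically. The example below assumes the cost model $C = f_1 n_a D^{2.7} + f_2 n_a + f_3 n_a^2$ with $n_a = 4A/(\pi\eta D^2)$, and locates the minimum with a golden-section search; the function names, the search interval, and any cost factors passed in are hypothetical and serve only to illustrate the procedure.

```python
import math


def array_cost(diameter, area, eta, f1, f2, f3, exponent=2.7):
    """Total cost for antennas of a given diameter at fixed collecting area."""
    n_a = 4.0 * area / (math.pi * eta * diameter ** 2)
    return f1 * n_a * diameter ** exponent + f2 * n_a + f3 * n_a ** 2


def optimum_diameter(area, eta, f1, f2, f3, d_lo=1.0, d_hi=100.0, tol=1e-5):
    """Golden-section search for the diameter that minimizes array_cost.

    The cost has a single minimum in D for positive cost factors, so a
    bracketing search over [d_lo, d_hi] is sufficient if the minimum lies inside.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = d_lo, d_hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if array_cost(c, area, eta, f1, f2, f3) < array_cost(d, area, eta, f1, f2, f3):
            b = d
        else:
            a = c
    return 0.5 * (a + b)
```

With the correlator term neglected ($f_3 = 0$), setting the derivative to zero gives $D^{2.7} = 2f_2/(0.7f_1)$, independent of A and $\eta$, which provides a simple check on the search. Including a correlator cost penalizes large numbers of antennas and moves the optimum toward larger diameters.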


5.7.3 Development of Extremely Large Arrays 


The concept of an array with a collecting area of ~ 1 square kilometer arose 
in the late 1990s after the Westerbork Synthesis Radio Telescope, the VLA, and 
similar instruments had demonstrated the power of the synthesis technique in high- 
resolution imaging and in cataloging and studying large numbers of sources. Such 
an array would have a collecting area of about two orders of magnitude greater than 
existing arrays at that time but would require significant technological development 
to be financially feasible. An initial objective was to extend the redshift range at 
which HI in galaxies can be studied by an order of magnitude to z ~ 2. The concept 
has been developed into a plan to build multiple arrays spanning the frequency 
interval of 70 MHz to greater than 25 GHz, with baselines up to about 5000 km. 
This instrument, collectively called the Square Kilometre Array (SKA),³ would 
have an enormous impact on a broad range of astronomical problems from planet 
formation to cosmology. The science case for the instrument has been presented 
by Carilli and Rawlings (2004) and Bourke et al. (2015). Technical details are 
given in Hall (2004) and Dewdney et al. (2009). The concept of such an array 
has led to the development of several smaller arrays to test the practicality and 
performance of possible technologies, including antenna and correlator designs. 
These include ASKAP, with 12-m-diameter antennas with a checkerboard phased- 
array feed system providing multiple beams (see Sect. 5.7.2.1), located in Western 
Australia (DeBoer et al. 2009), and MeerKAT, an array of low-cost 12-m-diameter 
dish antennas with single-pixel feeds to cover 0.7–10 GHz, located in the Karoo 
region of South Africa (Jonas 2009). 


5.7.4 The Direct Fourier Transform Telescope 


The normal practice in radio astronomy is to measure the correlation function of 
the incident electric field and then take its Fourier transform to obtain the image of 
the source intensity distribution. An alternative approach is to measure the Fourier 
transform of the incident electric field with a uniform array of antennas and take 
its square modulus to obtain the image. Either the correlation function or the direct 
Fourier transform approach must be implemented at the Nyquist rate appropriate for 
the bandwidth. The latter approach is simply an implementation of the Fraunhofer 
diffraction equation, which relates the aperture field distribution to the far field 
distribution (see Chap. 15). For this reason, instruments based on this method are 
sometimes called digital lenses. The Fraunhofer equation is also the basis of the 
holographic method of measuring the surface accuracy of parabolic antennas, as 
described in Sect. 17.3. 


³The SKA Memo Series can be found at http://www.skatelescope.org/publications. 




Daishido et al. (1984) described the operation and prototype of a direct Fourier 
transform telescope operating at 11 GHz. They called the instrument a “phased 
array telescope” because its operation was equivalent to forming phased array beams 
pointed at a grid of positions on the sky. The Fourier transform was effected through 
the use of Butler matrices. A 64-element array (8 × 8 elements on a uniform grid) 
was built at Waseda University and used for wide-field searches of transient sources 
(Nakajima et al. 1992, 1993; Otobe et al. 1994). The signal processing was further 
improved in another instrument aimed at pulsar observations (Daishido et al. 2000; 
Takeuchi et al. 2005). 

Interest has been renewed in the direct Fourier transform telescope because of the 
advent of arrays with very large numbers of antennas. In this case, the direct Fourier 
transform configuration can take advantage of the computational speed of the fast 
Fourier transform, which scales as $n_a \log_2 n_a$, where $n_a$ is the number of antennas. 
A detailed analysis of the direct Fourier transform telescope was developed by 
Tegmark and Zaldarriaga (2009, 2010). They were motivated by the challenges of 
measuring the wide-field distribution of redshifted HI emission, the signature of the 
Epoch of Reionization (see Sect. 10.7.2), and called their instrument the Fast Fourier 
Transform Telescope. Zheng et al. (2014) built a prototype 8 x 8 array at 150 MHz 
to develop techniques for such measurements. 
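
The equivalence of the two approaches for a uniform grid can be illustrated with a short numerical sketch. It assumes complex voltage samples from an idealized grid of identical antennas, forms the image once as the time-averaged squared modulus of the spatial FFT, and once as the Fourier transform of the correlations summed over all antenna pairs with the same spacing. The function names and the data layout are hypothetical; no calibration, primary beam, or bandwidth effects are included.

```python
import numpy as np


def direct_ft_image(voltages, pad=2):
    """Image from the squared modulus of the spatial FFT of the voltages.

    voltages: complex array of shape (n_time, ny, nx), one sample per antenna
    of a uniform ny-by-nx grid. pad >= 2 zero-pads the grid so that the
    correlation implied by the squared modulus is not wrapped around.
    """
    _, ny, nx = voltages.shape
    spectrum = np.fft.fft2(voltages, s=(pad * ny, pad * nx), axes=(-2, -1))
    return np.mean(np.abs(spectrum) ** 2, axis=0)


def correlation_image(voltages, pad=2):
    """The same image obtained from time-averaged pairwise correlations."""
    _, ny, nx = voltages.shape
    size_y, size_x = pad * ny, pad * nx
    corr = np.zeros((size_y, size_x), dtype=complex)
    for dy in range(-(ny - 1), ny):
        for dx in range(-(nx - 1), nx):
            # All antenna pairs separated by (dy, dx) grid units.
            a = voltages[:, max(dy, 0):ny + min(dy, 0), max(dx, 0):nx + min(dx, 0)]
            b = voltages[:, max(-dy, 0):ny + min(-dy, 0), max(-dx, 0):nx + min(-dx, 0)]
            corr[dy % size_y, dx % size_x] = np.mean(np.sum(a * np.conj(b), axis=(1, 2)))
    return np.fft.fft2(corr).real
```

The loop in `correlation_image` makes the redundancy of the uniform grid explicit: the shortest spacings are sampled by many antenna pairs and the longest by only one, whereas `direct_ft_image` reaches the same result with a single FFT per time sample.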

One characteristic of the direct Fourier transform telescope based on the FFT 
with a uniform-grid antenna layout is the high redundancy of short baselines. The 
situation is similar to that encountered in the design of the digital FFT spectrometers 
described in Sect. 8.8.5, wherein the number of equivalent baselines at large 
spacings is underrepresented. Methods of relaxing the requirement of uniform 
spacings have been explored by Tegmark and Zaldarriaga (2010) and Morales 
(2011). 

A disadvantage of the direct Fourier transform telescope relates to calibration. 
Since no baseline-based measurements are made, the traditional techniques of 
self-calibration based on amplitude and phase closure cannot be directly applied. 
There are several approaches to the calibration problem. The most straightforward 
approach is to transform the images back to the visibility domain on the time scale 
of instrumental and atmospheric variability, and apply the techniques described 
in Chap. 11. Auxiliary measurements can also be made to supply calibration 
information. More sophisticated methods are under development (e.g., Foster et al. 
2014; Beardsley et al. 2016). 


Chapter 6 
Response of the Receiving System 


This chapter is concerned with the response of the receiving system that accepts 
the signals from the antennas, amplifies and filters them, and measures the cross- 
correlations for the various antenna pairs. We show how the basic parameters of 
the system affect the output. Some of the effects were introduced in earlier chapters 
and are here presented in a more detailed development that leads to consideration 
of system design in Chaps. 7 and 8. At some point in the processing chain between 
the antenna and the correlator output, the form of the signals is changed from an 
analog voltage to a digital format, and the resulting data are thereafter processed 
by computer-type hardware. This does not affect the mathematical analysis of the 
processing and is not considered in this chapter. However, the digitization introduces 
a component of quantization noise, which is analyzed in Chap. 8. 


6.1 Frequency Conversion, Fringe Rotation, and Complex 
Correlators 


6.1.1 Frequency Conversion 


With the exception of some systems operating below ~ 100 MHz, in most radio 
astronomy instruments, the frequencies of the signals received at the antennas are 
changed by mixing with a local oscillator (LO) signal. This feature, referred to 
as frequency conversion (or heterodyne frequency conversion), enables the major 
part of the signal processing to be performed at intermediate frequencies that are 
most appropriate for amplification, transmission, filtering, delaying, recording, and 
similar processes. For observations at frequencies up to roughly 50 GHz, the best 
sensitivity is generally obtained by using a low-noise amplifying stage before the 
frequency conversion. 

Fig. 6.1 Frequency conversion in a radio receiving system. (a) Simplified diagram of a mixer and a 
filter H that defines the intermediate-frequency (IF) band. The nonlinear element shown is a diode. 
(b) Signal spectrum showing upper and lower sidebands that are converted to the IF. Frequency $\nu_0$ 
is the center of the IF band. 


Frequency conversion takes place in a mixer in which the signal to be converted, 
plus an LO waveform, are applied to a circuit element with a nonlinear voltage-current 
response. This element may be a diode, as shown in Fig. 6.1a. The current i 
through the diode can be expressed as a power series in the applied voltage V: 

$$ i = a_0 + a_1 V + a_2 V^2 + a_3 V^3 + \cdots . \tag{6.1} $$

Now let V consist of the sum of an LO voltage $b_1 \cos(2\pi\nu_{\rm LO} t + \theta_{\rm LO})$ and a signal, 
of which one Fourier component is $b_2 \cos(2\pi\nu_s t + \phi_s)$. The second-order term in V 
then gives rise to a product in the mixer output of the form 

$$ \begin{aligned} b_1 \cos(2\pi\nu_{\rm LO} t + \theta_{\rm LO}) \times b_2 \cos(2\pi\nu_s t + \phi_s) = {} & \tfrac{1}{2} b_1 b_2 \cos[2\pi(\nu_s + \nu_{\rm LO})t + \phi_s + \theta_{\rm LO}] \\ & + \tfrac{1}{2} b_1 b_2 \cos[2\pi(\nu_s - \nu_{\rm LO})t + \phi_s - \theta_{\rm LO}] . \end{aligned} \tag{6.2} $$


Thus, the current through the diode contains components at the sum and difference 
of $\nu_s$ and $\nu_{\rm LO}$. Other terms in Eq. (6.1) lead to other components, such as $3\nu_{\rm LO} + \nu_s$, 
but the filter H shown in Fig. 6.1 passes only the wanted output components, and 
with proper design, unwanted combinations can be prevented from falling within 
the filter passband. Usually the signal voltage is much smaller than the LO voltage, 
so harmonics and intermodulation products (i.e., spurious signals that arise as a 
result of cross products of different frequency components within the input signal 
band) are small compared with the wanted terms containing $\nu_{\rm LO}$. 
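
The sum and difference components produced by the second-order term can be checked numerically. The sketch below multiplies a sampled LO cosine by a sampled signal cosine and examines the spectrum of the product; the sampling parameters, amplitudes, and phases are arbitrary illustrative values, and the frequencies are assumed to fall exactly on FFT bins so that amplitudes and phases can be read off directly.

```python
import numpy as np


def mixer_product_spectrum(nu_lo, nu_s, b1=1.0, b2=0.1, theta_lo=0.3, phi_s=1.1,
                           sample_rate=1000.0, n_samples=1000):
    """Spectrum of the product of an LO cosine and a signal cosine.

    Returns the frequency axis (Hz), the amplitude of each cosine component,
    and its phase (rad), using a one-sided FFT.
    """
    t = np.arange(n_samples) / sample_rate
    lo = b1 * np.cos(2.0 * np.pi * nu_lo * t + theta_lo)
    sig = b2 * np.cos(2.0 * np.pi * nu_s * t + phi_s)
    spectrum = np.fft.rfft(lo * sig) / n_samples
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    return freqs, 2.0 * np.abs(spectrum), np.angle(spectrum)
```

With `nu_lo = 100.0` and `nu_s = 130.0`, the only significant components of the product lie at the difference and sum frequencies, each with amplitude $b_1 b_2/2$ and with phases $\phi_s - \theta_{\rm LO}$ and $\phi_s + \theta_{\rm LO}$, respectively.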

In most cases of frequency conversion, the signal frequency is being reduced, and 
the second term on the right side in Eq. (6.2) is the important one. The filter H then 
defines an intermediate-frequency (IF) band centered on $\nu_0$, as shown in Fig. 6.1b. 
Signals from within the bands centered on $\nu_{\rm LO} - \nu_0$ and $\nu_{\rm LO} + \nu_0$ are converted to the 
IF band and admitted by the filter. These bands are known as the lower and upper 
sidebands, as shown, and if only a single sideband is wanted, the other can often be 
removed by a suitable filter inserted before the mixer. In some cases, both sidebands 
are accepted, resulting in a double-sideband response. 


6.1.2 Response of a Single-Sideband System 


Fig. 6.2 Basic receiving system for two antennas of a synthesis array. The variable delay $\tau_i$ is 
continuously adjusted under computer control to compensate for the geometric delay $\tau_g$. The 
frequency response functions $H_m(\nu)$ and $H_n(\nu)$ represent the overall bandpass characteristics of 
the amplifiers and filters in the signal channels. 

Figure 6.2 shows a basic receiving system for two antennas, m and n, of a synthesis 
array. Here, we are interested in further effects of frequency conversion. The time 
difference $\tau_g$ between the arrival at the antennas of the signals from a radio source 
varies continuously as the Earth rotates and the antennas track the source across the 
sky. A variable instrumental delay $\tau_i$ is continuously adjusted to compensate for the 
geometric delay $\tau_g$, so that the signals arrive simultaneously at the correlator. The 
receiving channels through which the signals pass contain amplifiers and filters, the 
overall amplitude (voltage) responses of which are $H_m(\nu)$ and $H_n(\nu)$ for antennas 
m and n. Here, $\nu$ represents a frequency at the correlator input; the corresponding 
frequency at the antenna is $\nu_{\rm LO} + \nu$. The voltage waveforms that are processed 
by the receiving system result from cosmic noise and system noise; we consider 
the usual case in which these processes are approximately constant across the 
receiver passband. The spectra at the correlator inputs are thus determined mainly 
by the response of the receiving system. Let $\phi_m$ be the phase change in the signal 
path through antenna m resulting from $\tau_g$ and the LO phase, and let $\phi_n$ be the 
corresponding phase change in the signal for the path through antenna n, including 
$\tau_i$. $\phi_m$ and $\phi_n$, together with the instrumental phase resulting from the amplifiers and 
filters, represent the phases of the cosmic signal at the correlator inputs. Negative 
values of these parameters indicate phase lag (signal delay). The response to a 
source for which the visibility is $V(u,v) = |V| e^{j\phi_v}$ is most easily obtained by 
returning to Eq. (3.5) and replacing the phase difference $2\pi \mathbf{D}_\lambda \cdot \mathbf{s}_0$ by the general 
term $\phi_n - \phi_m$. Then the response at the correlator output resulting from a frequency 
band of width $d\nu$ can be written as 

$$ dr = {\rm Re}\left\{ A_0 |V| H_m(\nu) H_n^*(\nu) e^{j(\phi_n - \phi_m - \phi_v)} d\nu \right\} , \tag{6.3} $$

where $\phi_v$ is the visibility phase. The response from the full system passband is 

$$ r = {\rm Re}\left\{ A_0 |V| \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{j(\phi_n - \phi_m - \phi_v)} d\nu \right\} , \tag{6.4} $$

where we have included both positive and negative frequencies¹ in the integral 
and assumed that V does not vary significantly over the observing bandwidth. 
Equation (6.4) represents the real part of the complex cross-correlation, and the 
way to obtain both the real and imaginary parts is explained later in this section. 


6.1.3 Upper-Sideband Reception 


For upper-sideband reception, a filter or amplifier at the receiver input selects 
frequencies in a band defined by the correlator input spectrum (frequency $\nu$) plus 
$\nu_{\rm LO}$. In Fig. 6.2, the signal entering antenna m traverses the geometric delay $\tau_g$ at a 
frequency $\nu_{\rm LO} + \nu$ and thus suffers a phase shift $2\pi(\nu_{\rm LO} + \nu)\tau_g$. At the mixer, its 
phase is also decreased by the LO phase $\theta_m$. Thus, we obtain 

$$ \phi_m(\nu) = -2\pi(\nu_{\rm LO} + \nu)\tau_g - \theta_m . \tag{6.5} $$

The phase of the signal entering antenna n is decreased by the LO phase $\theta_n$, and the 
signal then traverses the instrumental delay $\tau_i$ at a frequency $\nu$, thus suffering a shift 
$2\pi\nu\tau_i$. The total phase shift for antenna n is 

$$ \phi_n(\nu) = -2\pi\nu\tau_i - \theta_n . \tag{6.6} $$

From Eqs. (6.4), (6.5), and (6.6), the correlator output is 

$$ r_u = {\rm Re}\left\{ A_0 |V| e^{j[2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) - \phi_v]} \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{j2\pi\nu\Delta\tau} d\nu \right\} . \tag{6.7} $$

¹The negative frequencies have no physical meaning but arise as part of the mathematical 
representation of the frequency conversion. 


The real part of the integral in Eq. (6.7) is one-half the Fourier transform of 
the (Hermitian) cross power spectrum $H_m(\nu)H_n^*(\nu)$ with respect to the delay 
compensation error, $\Delta\tau = \tau_g - \tau_i$, which introduces a linear phase slope across the 
band.² We assume that V does not vary significantly over the observing bandwidth. 
For example, if the IF passbands are rectangular with center frequency $\nu_0$, width 
$\Delta\nu_{\rm IF}$, and identical phase responses, then for positive frequencies, 

$$ |H_m(\nu)| = |H_n(\nu)| = \begin{cases} H_0 , & |\nu - \nu_0| < \dfrac{\Delta\nu_{\rm IF}}{2} \\[1ex] 0 , & |\nu - \nu_0| > \dfrac{\Delta\nu_{\rm IF}}{2} . \end{cases} \tag{6.8} $$

Using the equality in Eq. (A3.6) of Appendix 3.1 for the Hermitian³ function $H_m H_n^*$, 
we can write 

$$ \begin{aligned} \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{j2\pi\nu\Delta\tau} d\nu &= 2{\rm Re}\left[ \int_{\nu_0 - (\Delta\nu_{\rm IF}/2)}^{\nu_0 + (\Delta\nu_{\rm IF}/2)} H_0^2 e^{j2\pi\nu\Delta\tau} d\nu \right] \\ &= 2 H_0^2 \Delta\nu_{\rm IF} \left[ \frac{\sin(\pi\Delta\nu_{\rm IF}\Delta\tau)}{\pi\Delta\nu_{\rm IF}\Delta\tau} \right] \cos 2\pi\nu_0\Delta\tau . \end{aligned} \tag{6.9} $$

²Here, we assume that the source is sufficiently close to the center of the field being imaged that 
the condition $\Delta\tau = 0$ maintains zero delay error. The effect of the variation of the delay error 
across a wider field of view is considered in Sect. 6.3. 

³The term “Hermitian” indicates a function in which the real part is even and the imaginary part is 
odd. 

In the general case, we define an instrumental gain factor $G_{mn} = |G_{mn}| e^{j\phi_G}$ as 
follows: 

$$ A_0 \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{j2\pi\nu\Delta\tau} d\nu = G_{mn}(\Delta\tau) e^{j2\pi\nu_0\Delta\tau} = |G_{mn}(\Delta\tau)| e^{j(2\pi\nu_0\Delta\tau + \phi_G)} , \tag{6.10} $$

where the $\Delta\tau$ dependence in $|G_{mn}(\Delta\tau)|$ is the sinc function in Eq. (6.9). The phase 
$\phi_G$ results from the difference in the phase responses of the amplifiers and filters 
in the two channels. The LO phases $\theta_m$ and $\theta_n$ are not included within the general 
instrumental phase term $\phi_G$ because they enter into the upper and lower sidebands 
with different signs. 
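
The sinc dependence on the delay error for a rectangular passband can be verified numerically. The sketch below integrates $H_0^2 e^{j2\pi\nu\Delta\tau}$ over a band of width $\Delta\nu_{\rm IF}$ centered on $\nu_0$, takes twice the real part, and compares the result with the closed form $2H_0^2\Delta\nu_{\rm IF}\,[\sin(\pi\Delta\nu_{\rm IF}\Delta\tau)/(\pi\Delta\nu_{\rm IF}\Delta\tau)]\cos 2\pi\nu_0\Delta\tau$. The function names and the band parameters used in any call are illustrative assumptions.

```python
import numpy as np


def delay_response_closed_form(delta_tau, nu0, bandwidth, h0=1.0):
    """Closed-form response of a rectangular passband to a delay error."""
    # np.sinc(x) is sin(pi x) / (pi x).
    return (2.0 * h0 ** 2 * bandwidth * np.sinc(bandwidth * delta_tau)
            * np.cos(2.0 * np.pi * nu0 * delta_tau))


def delay_response_numeric(delta_tau, nu0, bandwidth, h0=1.0, n_points=20001):
    """Twice the real part of the passband integral, by the trapezoidal rule."""
    nu = np.linspace(nu0 - bandwidth / 2.0, nu0 + bandwidth / 2.0, n_points)
    integrand = h0 ** 2 * np.exp(2j * np.pi * nu * delta_tau)
    integral = np.sum((integrand[1:] + integrand[:-1]) * np.diff(nu)) / 2.0
    return 2.0 * integral.real
```

The envelope falls to zero when the delay error equals the reciprocal of the bandwidth, which indicates how accurately the compensating delay must be set: the error should be a small fraction of $1/\Delta\nu_{\rm IF}$.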

Substituting Eq. (6.10) into Eq. (6.7), we obtain for upper-sideband reception 


$$ r_u = |V| |G_{mn}(\Delta\tau)| \cos[2\pi(\nu_{\rm LO}\tau_g + \nu_0\Delta\tau) + (\theta_m - \theta_n) - \phi_v + \phi_G] . \tag{6.11} $$


The term $2\pi\nu_{\rm LO}\tau_g$ in the cosine function results in a quasi-sinusoidal oscillation as 
the source moves through the fringe pattern. The phase of this oscillation depends 
on the delay error $\Delta\tau$, the relative phases of the LO signals, the phase responses 
of the signal channels, and the phase of the visibility function. The frequency of 
the output oscillation $\nu_{\rm LO}\,d\tau_g/dt$ is often referred to as the natural fringe frequency. 
The oscillations result because the signals traverse the delays $\tau_g$ and $\tau_i$ at different 
frequencies, that is, at the input radio frequency for $\tau_g$ and at the intermediate 
frequency for $\tau_i$, and these two frequencies differ by $\nu_{\rm LO}$. Thus, even if these 
two delays are identical, they introduce different phase shifts, and they increase 
or decrease progressively as the Earth rotates. 


6.1.4 Lower-Sideband Reception 


Consider now the situation where the frequencies accepted from the antenna are 
those in the lower sideband, at $\nu_{\rm LO}$ minus the correlator input frequencies. The 
phases are 

$$ \phi_m = 2\pi(\nu_{\rm LO} - \nu)\tau_g + \theta_m \tag{6.12} $$

and 

$$ \phi_n = -2\pi\nu\tau_i + \theta_n . \tag{6.13} $$

The signs of these terms and of $\phi_v$ differ from those in the upper-sideband case 
because increasing the phase of the signal at the antenna here decreases the phase at 
the correlator. The expression for the correlator output is 

$$ r_\ell = {\rm Re}\left\{ A_0 |V| e^{-j[2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) - \phi_v]} \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{j2\pi\nu\Delta\tau} d\nu \right\} . \tag{6.14} $$

Proceeding as in the upper-sideband case, we obtain 

$$ r_\ell = |V| |G_{mn}(\Delta\tau)| \cos[2\pi(\nu_{\rm LO}\tau_g - \nu_0\Delta\tau) + (\theta_m - \theta_n) - \phi_v - \phi_G] . \tag{6.15} $$


6.1.5 Multiple Frequency Conversions 


In an operational system, the signals may undergo several frequency conversions 
between the antennas and the correlators. A frequency conversion in which the 
output is at the lower sideband (i.e., the LO frequency minus the input frequency) 
results in a reversal of the signal spectrum in which frequencies at the high end at the 
input appear at the low end at the output, and vice versa. If there is no net reversal 
(that is, an even number of lower-sideband conversions), Eq. (6.11) applies, except 
that $\nu_{\rm LO}$ must be replaced by a corresponding combination of LO frequencies. 
Similarly, the oscillator phase terms $\theta_m$ and $\theta_n$ are replaced by corresponding 
combinations of oscillator phases. 


6.1.6 Delay Tracking and Fringe Rotation 


Adjustment of the compensating delay $\tau_i$ of Fig. 6.2 is usually accomplished under 
computer control, the required delay being a function of the antenna positions and 
the position of the phase center of the field under observation. This can be achieved 
by designating one antenna of the array as the delay reference and adjusting the 
instrumental delays of other antennas so that, for an incoming wavefront from the 
phase reference direction, the signals intercepted by the different antennas all arrive 
at the correlator simultaneously. 

To control the frequency of the sinusoidal fringe variations in the correlator 
output, a continuous phase change can be inserted into one of the LO signals. 
Equations (6.11) and (6.15) show that the fringe frequency can be reduced to zero 
by causing $\theta_m - \theta_n$ to vary at a rate that maintains constant, modulo $2\pi$, the term 
$[2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n)]$. This requires adding a frequency $2\pi\nu_{\rm LO}(d\tau_g/dt)$ to $\theta_n$ or 
subtracting it from $\theta_m$. Note that $d\tau_g/dt$ can be evaluated from Eq. (4.9) in which 
w, the third component of the interferometer baseline, is equal to $c\tau_g$ measured in 
wavelengths; for example, for an east-west antenna spacing of 1 km, the maximum 
value of $d\tau_g/dt$ is $2.42 \times 10^{-10}$, so the fringe frequencies are generally small 
compared with the radio frequencies involved. Reduction of the output frequency 
reduces the quantity of data to be processed, since each correlator output must 
be sampled at least twice per cycle of the output frequency (the Nyquist rate) 
to preserve the information, as is discussed in Sect. 8.2.1. With antenna spacings 
required for angular resolution of milliarcsecond order, which occur in VLBI, the 
natural fringe frequency, $\nu_{\rm LO}\,d\tau_g/dt$, can exceed 10 kHz. For an array with more 
than one antenna pair, it is possible to reduce each output frequency to the same 
fraction of its natural frequency, or to zero. Reduction to zero frequency (fringe 
stopping) is generally the preferred practice. Some special technique, such as the 
use of a complex correlator, described in the following section, is then required to 
extract the amplitude and phase of the output. 
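
The magnitudes involved can be estimated with a few lines of arithmetic. The sketch below assumes an east-west baseline, for which the delay rate is largest when the source is on the celestial equator and on the meridian, giving a maximum of $\omega_e D/c$, where $\omega_e$ is the angular rotation rate of the Earth and D is the baseline length. The function names are hypothetical, and the example baseline and LO frequency in the note that follows are illustrative only.

```python
OMEGA_EARTH = 7.292115e-5   # sidereal rotation rate of the Earth, rad/s
C = 299_792_458.0           # speed of light, m/s


def max_delay_rate(baseline_m):
    """Maximum d(tau_g)/dt (dimensionless) for an east-west baseline."""
    return OMEGA_EARTH * baseline_m / C


def natural_fringe_frequency(nu_lo_hz, baseline_m):
    """Maximum natural fringe frequency nu_LO * d(tau_g)/dt, in Hz."""
    return nu_lo_hz * max_delay_rate(baseline_m)


def minimum_sample_rate(fringe_frequency_hz):
    """Nyquist rate for a correlator output oscillating at the fringe frequency."""
    return 2.0 * fringe_frequency_hz
```

For a 1-km baseline the delay rate is about $2.4 \times 10^{-10}$, so with an LO frequency of a few gigahertz the natural fringe frequency is of order 1 Hz. For a hypothetical 8000-km baseline and a 10-GHz LO, the same expressions give a fringe frequency near 20 kHz, which is why fringe stopping is needed to keep the output data rate manageable.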


6.1.7 Simple and Complex Correlators 


Fig. 6.3 Use of two correlators to measure the real and imaginary parts of the visibility. This 
system is called a complex correlator. 

A method of measuring the amplitude and phase of the correlator output signal when 
the fringe frequency is reduced to zero is shown in Fig. 6.3. Two correlators are used, 
one of which has a quadrature phase shift network in one input. For signals of finite 
bandwidth, this phase shift is not equivalent to a delay. The phase shift can also be 
effected by feeding the signal into two separate mixers and converting it with two 
LOs in phase quadrature. The output of the second correlator can be represented 
by replacing $H_m(\nu)$ by $H_m(\nu)e^{-j\pi/2}$. From Eq. (6.10), the result is to add $-\pi/2$ to 
$\phi_G$, and thus in Eq. (6.11) and Eq. (6.15), the cosine function is replaced by sine. 
Another way of comparing the two correlator outputs in Fig. 6.3 is to note that the 
real output of the complex correlator, omitting constant factors, is 

$$ r_{\rm real} = {\rm Re}\left\{ V \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu \right\} = {\rm Re}\{V\} \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu , \tag{6.16} $$

where the integral is real since $H_m(\nu)$ and $H_n^*(\nu)$ are Hermitian and thus 
$H_m(\nu)H_n^*(\nu)$ is Hermitian. The imaginary output of the correlator is proportional to 

$$ r_{\rm imag} = {\rm Re}\left\{ V \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu) e^{-j\pi/2}\, d\nu \right\} = {\rm Im}\{V\} \int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu . \tag{6.17} $$


Thus, the two outputs respond to the real and imaginary parts of the visibility V. 

The combination of two correlators and the quadrature network is usually 
referred to as a complex correlator, and the two outputs as the cosine and sine, 
or real and imaginary, outputs. For continuum observations, the compensating
delay is adjusted so that $\Delta\tau = 0$ and the fringe rotation maintains the condition
$2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) = 0$. Thus, the cosine and sine outputs represent the real
and imaginary parts of $G_{mn}\mathcal{V}(u, v)$. With the use of the complex correlator, the
rotation of the Earth, which sweeps the fringe pattern across the source, is no longer 
a necessary feature in the measurement of visibility. An important feature of the 
complex correlator is that the noise fluctuations in the cosine and sine outputs are 
independent, as discussed in Sect. 6.2.2. 
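A toy numerical illustration of the complex correlator may help here. The sketch below is not the instrument described in the text: it assumes two unit-amplitude monochromatic tones in place of band-limited noise, models the quadrature network as a $-\pi/2$ shift of the tone from antenna m, and uses hypothetical function names. With the fringes stopped, the cosine and sine outputs together give the amplitude and phase.

```python
import math


def complex_correlator(phase_rad, n_samples=1000, cycles=10):
    """Time-average the products formed by the two correlators of Fig. 6.3.

    Two unit-amplitude tones stand in for the antenna signals; the tone from
    antenna m leads by `phase_rad`. Averaging over an integer number of
    cycles removes the double-frequency product term.
    """
    real_sum = 0.0
    imag_sum = 0.0
    for k in range(n_samples):
        wt = 2.0 * math.pi * cycles * k / n_samples
        x_n = math.cos(wt)
        x_m = math.cos(wt + phase_rad)
        # Same tone after the -pi/2 quadrature network.
        x_m_quad = math.cos(wt + phase_rad - math.pi / 2.0)
        real_sum += x_m * x_n       # cosine (real) output
        imag_sum += x_m_quad * x_n  # sine (imaginary) output
    return real_sum / n_samples, imag_sum / n_samples


def amplitude_and_phase(real_out, imag_out):
    """Convert the cosine and sine outputs to amplitude and phase."""
    return math.hypot(real_out, imag_out), math.atan2(imag_out, real_out)


real_out, imag_out = complex_correlator(0.7)
amp, phase = amplitude_and_phase(real_out, imag_out)
print(f"amplitude = {amp:.3f}, phase = {phase:.3f} rad")
```

For unit-amplitude tones the time-averaged product has amplitude 1/2, so the recovered amplitude is 0.5 and the recovered phase equals the input phase.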

Spectral correlator systems, in which a number of correlators are used to measure 
the correlation as a function of time offset or “lag” [i.e., $\tau$ in Eq. (3.27)], are
discussed in Sect. 8.8. The correlation as a function of $\tau$ measured using a correlator
with a quadrature phase shift in one input is the Hilbert transform of the same 
quantity measured without the quadrature phase shift (Lo et al. 1984). 


6.1.8 Response of a Double-Sideband System 


A double-sideband (DSB) receiving system is one in which both the upper- and 
lower-sideband responses are accepted. From Eqs. (6.11) and (6.15), the output is 


$$r_d = r_u + r_\ell = 2|\mathcal{V}||G_{mn}(\Delta\tau)|\cos(2\pi\nu_0\Delta\tau + \phi_G)\cos\left[2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) - \phi_v\right] . \tag{6.18}$$


There is a significant difference from the single-sideband (SSB) cases. The phase
of the fringe-frequency term, which is the cosine function containing the term
$2\pi\nu_{\rm LO}\tau_g$, is no longer dependent on $\Delta\tau$ or $\phi_G$, but instead these quantities appear
in the term that controls the fringe amplitude:

$$|G_{mn}(\Delta\tau)|\cos(2\pi\nu_0\Delta\tau + \phi_G) . \tag{6.19}$$


If the delay $\tau_i$ is held constant, $\Delta\tau$ varies continuously, resulting in cosinusoidal
modulation of the fringe oscillations through the cosine term in (6.19). Also, as
shown in Fig. 6.4, the cross-correlation (fringe amplitude) falls off more rapidly
because of the cosine term in (6.19) than it does in the SSB case, in which it
depends only on $G_{mn}(\Delta\tau)$. The required precision in matching the geometric and
instrumental delays is correspondingly increased. The lack of dependence of the
fringe phase on the phase response of the signal channel occurs because the latter
has equal and opposite effects on the signals from the two sidebands.

Fig. 6.4 Example of the variation of the fringe amplitude as a function of $\Delta\tau$ for a DSB system
(solid line). In this case, the centers of the two sidebands are separated by three times the IF
bandwidth, that is, $\nu_0 = 1.5\Delta\nu_{\rm IF}$, and the IF response is rectangular. The broken line shows the
equivalent function for an SSB system with the same IF response.
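The behavior plotted in Fig. 6.4 can be sketched numerically. The code below assumes a rectangular IF response, for which $|G_{mn}(\Delta\tau)|$ is proportional to $|\sin(\pi\Delta\nu_{\rm IF}\Delta\tau)/(\pi\Delta\nu_{\rm IF}\Delta\tau)|$, and normalizes both curves to unity at $\Delta\tau = 0$. The function names and the 100-MHz bandwidth are illustrative choices, not values from the text.

```python
import math


def sinc(x):
    """Normalized sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def ssb_amplitude(delta_tau, bw_if):
    """Normalized |G_mn(delta_tau)| for a rectangular IF band of width bw_if."""
    return abs(sinc(bw_if * delta_tau))


def dsb_amplitude(delta_tau, bw_if, nu_0, phi_g=0.0):
    """Normalized DSB fringe amplitude, the magnitude of expression (6.19)."""
    modulation = math.cos(2.0 * math.pi * nu_0 * delta_tau + phi_g)
    return ssb_amplitude(delta_tau, bw_if) * abs(modulation)


bw_if = 100e6          # Hz, hypothetical IF bandwidth
nu_0 = 1.5 * bw_if     # sideband centers separated by three IF bandwidths
first_null = 1.0 / (4.0 * nu_0)  # delay error at which the cosine term vanishes
print(f"delay error  = {first_null * 1e9:.3f} ns")
print(f"SSB amplitude = {ssb_amplitude(first_null, bw_if):.4f}")
print(f"DSB amplitude = {dsb_amplitude(first_null, bw_if, nu_0):.2e}")
```

At a delay error of $1/(4\nu_0)$ the DSB fringe amplitude has already fallen to zero while the SSB amplitude is still above 0.95, which is the sense in which the delay tolerance is tighter for DSB systems.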

The response of a DSB system with a complex correlator is given by Eq. (6.18)
for the cosine output, and the sine output is obtained by replacing $\phi_G$ by $\phi_G - \pi/2$:

$$(r_d)_{\rm sine} = 2|\mathcal{V}||G_{mn}(\Delta\tau)|\sin(2\pi\nu_0\Delta\tau + \phi_G)\cos\left[2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) - \phi_v\right] . \tag{6.20}$$


If the term $2\pi\nu_0\Delta\tau + \phi_G$ is adjusted to maximize either the real output [Eq. (6.18)]
or the imaginary output [Eq. (6.20)], the other will be zero. Thus, for continuum 
observations in which the signal is of equal strength in both sidebands, the complex 
correlator offers no increase in sensitivity. However, it can be useful for observations 
in the sideband-separation mode described later. 
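The statement that the sine output adds no information can be checked numerically. The sketch below takes the two single-sideband responses in the form that appears inside Eq. (6.28), sums them, and compares the result with the product forms of Eqs. (6.18) and (6.20). The function names and parameter values are hypothetical, and `amp` stands for $|\mathcal{V}||G_{mn}(\Delta\tau)|$.

```python
import math


def sideband_outputs(amp, nu_lo, nu_0, tau_g, dtau, theta_mn, phi_v, phi_g):
    """Real outputs for the upper and lower sidebands, as in Eq. (6.28)."""
    fringe = 2.0 * math.pi * nu_lo * tau_g + theta_mn - phi_v
    offset = 2.0 * math.pi * nu_0 * dtau + phi_g
    return amp * math.cos(fringe + offset), amp * math.cos(fringe - offset)


def dsb_outputs(amp, nu_lo, nu_0, tau_g, dtau, theta_mn, phi_v, phi_g):
    """Cosine and sine outputs of a DSB system, Eqs. (6.18) and (6.20)."""
    fringe = 2.0 * math.pi * nu_lo * tau_g + theta_mn - phi_v
    offset = 2.0 * math.pi * nu_0 * dtau + phi_g
    cosine = 2.0 * amp * math.cos(offset) * math.cos(fringe)
    sine = 2.0 * amp * math.sin(offset) * math.cos(fringe)
    return cosine, sine


# Hypothetical values: 100-GHz LO, 4-GHz IF center, picosecond-scale delays.
params = dict(amp=1.0, nu_lo=100e9, nu_0=4e9, tau_g=1.0e-11, dtau=2.0e-11,
              theta_mn=0.2, phi_v=0.5, phi_g=0.1)
r_u, r_l = sideband_outputs(**params)
cosine, sine = dsb_outputs(**params)
print(f"r_u + r_l = {r_u + r_l:.6f}, Eq. (6.18) = {cosine:.6f}")
print(f"sine/cosine = {sine / cosine:.6f}")
```

The ratio of the sine to the cosine output equals $\tan(2\pi\nu_0\Delta\tau + \phi_G)$ whatever the visibility phase, so both outputs carry the same fringe term and differ only by an instrumental scale factor.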

To help visualize the difference between SSB and DSB interferometer systems,
Fig. 6.5 illustrates the correlator outputs in the complex plane. The SSB case is
shown in Fig. 6.5a. The output of the complex correlator is represented by the vector
r. If the fringes are not stopped, the vector r rotates through $2\pi$ each time the
geometric delay $\tau_g$ changes by one wavelength (that is, one wavelength at the LO
frequency if the instrumental delay is tracking the geometric delay). The projections
of the radial vector on the real and imaginary axes indicate the real and imaginary
outputs of the complex correlator, which are two fringe-frequency sinusoids in phase
quadrature. If the fringes are stopped, r remains fixed in position angle. Figure 6.5b
represents the DSB case. Vectors $r_u$ and $r_\ell$ represent the output components from
the upper and lower sidebands. Here the variation of $\tau_g$ causes $r_u$ and $r_\ell$ to rotate in
opposite directions. To verify this statement, note that the real parts of the correlator
output are given in Eqs. (6.11) and (6.15), and the corresponding imaginary parts
are obtained by replacing $\phi_G$ by $\phi_G - \pi/2$. Then with $(\theta_m - \theta_n) = 0$ (no fringe
rotation), consider the effect of a small change in $\tau_g$.

Fig. 6.5 Representation in the complex plane of the output of a correlator with (a) an SSB and (b)
a DSB receiving system. The point C in (b) represents the sum of the upper- and lower-sideband
outputs of the correlator.

The contra-rotating vectors representing the two sidebands at the correlator 
output coincide at an angle determined by instrumental phase, which we represent 
by the line AB in Fig. 6.5b. Thus, the vector sum oscillates along this line, and the 
fringe-frequency sinusoids at the real and imaginary outputs of the correlator are 
in phase. Now suppose we adjust the phase term $(2\pi\nu_0\Delta\tau + \phi_G)$ in Eq. (6.18)
to maximize the fringe amplitude at the real output. This action has the effect of
rotating the line AB to coincide with the real axis. The imaginary output of the
complex correlator then contains no signal, only noise. From Eq. (6.18), it can
be seen that the visibility phase $\phi_v$ is represented by the phase of the vector that
oscillates in amplitude along the real axis. The phase can be recovered by letting the
fringes run and fitting a sinusoid to the waveform at the real output. If the fringes are
stopped, it is possible to determine the amplitude and phase of the fringes by $\pi/2$
switching of the LO phase at one antenna. In Eq. (6.18), this phase switch action can
be represented by $\theta_m \to (\theta_m - \pi/2)$, which results in a change of the second cosine
function to a sine, thus enabling the argument in square brackets to be determined.
However, in such a case, the data representing the cosine and sine components of
the output are not measured simultaneously, so the effective data-averaging time is
half that for the SSB, complex-correlator case. In Fig. 6.5b, a $\pi/2$ switch of the LO
phase results in a rotation of $r_u$ and $r_\ell$ by $\pi/2$ in opposite directions, so the vector
sum of the two sideband outputs remains on the line AB. Relative sensitivities of 
different systems are discussed in Sect. 6.2.5. 
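The geometry of Fig. 6.5b can be reproduced with complex numbers. In the sketch below, `fringe_phase` stands for $2\pi\nu_{\rm LO}\tau_g + (\theta_m - \theta_n) - \phi_v$ and `instr_phase` for $2\pi\nu_0\Delta\tau + \phi_G$. Writing each sideband as a unit-scaled phasor in this way is an assumption that follows from taking the imaginary parts with $\phi_G \to \phi_G - \pi/2$, and the function names are hypothetical.

```python
import cmath
import math


def sideband_vectors(fringe_phase, instr_phase, amp=1.0):
    """Complex correlator outputs r_u and r_l; they counter-rotate with fringe_phase."""
    r_u = amp * cmath.exp(1j * (instr_phase + fringe_phase))
    r_l = amp * cmath.exp(1j * (instr_phase - fringe_phase))
    return r_u, r_l


def dsb_sum(fringe_phase, instr_phase, amp=1.0):
    """Vector sum of the two sidebands (point C in Fig. 6.5b)."""
    r_u, r_l = sideband_vectors(fringe_phase, instr_phase, amp)
    return r_u + r_l


def fringe_phase_from_switching(fringe_phase, amp=1.0):
    """Recover the fringe phase from two real-output readings.

    Line AB is assumed already rotated onto the real axis (instr_phase = 0).
    The second reading is taken with theta_m reduced by pi/2.
    """
    normal = dsb_sum(fringe_phase, 0.0, amp).real
    switched = dsb_sum(fringe_phase - math.pi / 2.0, 0.0, amp).real
    return math.atan2(switched, normal)


for a in (0.0, 0.5, 1.0, 2.0):
    c = dsb_sum(a, 0.4)
    print(f"fringe phase {a:.1f}: |sum| = {abs(c):.3f}, "
          f"angle mod pi = {math.atan2(c.imag, c.real) % math.pi:.3f}")
print(f"recovered phase = {fringe_phase_from_switching(1.1):.3f}")
```

The sum always lies on a line at the instrumental angle (0.4 rad here, modulo $\pi$), with length $2|\cos(\text{fringe phase})|$, and the two switched readings return the fringe phase through an arctangent.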
Fig. 6.6 Receiving system for two antennas that incorporates two frequency conversions, the first
being DSB and the second upper-sideband. Two compensating delays, $\tau_{i1}$ and $\tau_{i2}$, are included
so that in deriving the response for a DSB system, the effect of the position of the delay relative
to the first mixer can be investigated. In practice, only one compensating delay is required. The
overall frequency responses $H_m$ and $H_n$ are specified as functions of $\nu$, which is the corresponding
frequency at the correlator input.


6.1.9 Double-Sideband System with Multiple Frequency Conversions


The response with multiple frequency conversions is more complicated for a DSB 
interferometer than for an SSB one and is illustrated by considering the system in 
Fig. 6.6. Note that for the case in which the IF signal undergoes a number of SSB 
frequency conversions after the first mixer, the second mixer of each antenna in 
Fig. 6.6 can be considered to represent several mixers in series, and $\nu_2$ is equal to the
sum of the LO frequencies with appropriate signs to take account of upper- or lower- 
sideband conversions. The signal phase terms are determined by considerations 
similar to those described in the derivation of Eqs. (6.5) and (6.6). Thus, we obtain 


$$\phi_m = \mp 2\pi(\nu_1 \pm \nu_2 \pm \nu)\tau_g \mp \theta_{m1} - \theta_{m2} \tag{6.21}$$

and

$$\phi_n = -2\pi(\nu_2 + \nu)\tau_{i1} - 2\pi\nu\tau_{i2} \mp \theta_{n1} - \theta_{n2} , \tag{6.22}$$


where the upper signs correspond to upper-sideband conversion at both the first and 
second mixers for each antenna, and the lower signs to lower-sideband conversion 
at the first mixer for each antenna and upper-sideband conversion at the second. 
We then proceed as in the previous examples; that is, use Eqs. (6.21) and (6.22) to 
substitute for $\phi_m$ and $\phi_n$ in Eq. (6.4), separate out the integral of $H_m H_n^*$ with respect
to frequency, $\nu$, as in Eq. (6.7), and substitute for the integral using Eq. (6.10). The
results are 


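$$r_u = |\mathcal{V}||G_{mn}(\Delta\tau)|\cos\left[2\pi\nu_1\tau_g + 2\pi\nu_2(\tau_g - \tau_{i1}) + 2\pi\nu_0\Delta\tau + (\theta_{m1} - \theta_{n1}) + (\theta_{m2} - \theta_{n2}) - \phi_v + \phi_G\right] \tag{6.23}$$

and

$$r_\ell = |\mathcal{V}||G_{mn}(\Delta\tau)|\cos\left[2\pi\nu_1\tau_g - 2\pi\nu_2(\tau_g - \tau_{i1}) - 2\pi\nu_0\Delta\tau + (\theta_{m1} - \theta_{n1}) - (\theta_{m2} - \theta_{n2}) - \phi_v - \phi_G\right] . \tag{6.24}$$

The DSB response is

$$\begin{aligned}
r_d &= r_u + r_\ell \\
&= 2|\mathcal{V}||G_{mn}(\Delta\tau)|\cos\left\{2\pi\left[\nu_2(\tau_{i1} - \tau_g) - \nu_0\Delta\tau\right] - (\theta_{m2} - \theta_{n2}) - \phi_G\right\} \\
&\quad\times\cos\left[2\pi\nu_1\tau_g + (\theta_{m1} - \theta_{n1}) - \phi_v\right] ,
\end{aligned} \tag{6.25}$$

where $\Delta\tau = \tau_g - \tau_{i1} - \tau_{i2}$. Note that the phase of the output fringe pattern, given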
by the second cosine term, depends only on the phase of the first LO. Thus, in the 
implementation of fringe rotation, the phase shift must be applied to this oscillator. 
The first cosine term in Eq. (6.25) affects the fringe amplitude, and two cases should 
be considered: 


1. The delay $\tau_{i1}$, at the IF immediately following the DSB mixer, is used as the
compensating delay, and $\tau_{i2} = 0$. Then in the first cosine function in Eq. (6.25),
$\tau_{i1} - \tau_g \approx 0$, and $\phi_G$ should be small if the frequency responses of the two
channels are similar. It is necessary only to equalize $\theta_{m2}$ and $\theta_{n2}$ to maximize the
amplitude of the fringe-frequency term. This is similar to the single conversion
case in Eq. (6.18).

2. The delay $\tau_{i2}$, located after the last mixer, is used as the compensating delay, and
$\tau_{i1} = 0$. (This is the case in any array in which the compensating delays are
implemented digitally, which includes almost all currently operational systems.)
Then a continuously varying phase shift is required in $\theta_{m2}$ or $\theta_{n2}$ of Eq. (6.25) to
keep the value of the first cosine function close to unity as $\tau_g$ varies. This phase
shift does not affect the phase of the output fringe oscillations, only the amplitude
[see, e.g., Wright et al. (1973)].
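The two cases can be compared by evaluating the first cosine factor of Eq. (6.25) directly. The sketch below is illustrative only: the frequencies and delay are hypothetical, the helper names are invented here, and the compensating phase $2\pi\nu_2\tau_g$ applied to $\theta_{n2}$ is simply the value that sets the argument of the cosine to zero when $\Delta\tau = 0$ and $\phi_G = 0$.

```python
import math


def amplitude_factor(nu_2, nu_0, tau_g, tau_i1, tau_i2,
                     theta_m2, theta_n2, phi_g=0.0):
    """First cosine factor of Eq. (6.25), which scales the fringe amplitude."""
    dtau = tau_g - tau_i1 - tau_i2
    arg = (2.0 * math.pi * (nu_2 * (tau_i1 - tau_g) - nu_0 * dtau)
           - (theta_m2 - theta_n2) - phi_g)
    return math.cos(arg)


def compensating_phase(nu_2, tau_g):
    """Phase to add to theta_n2 in case 2 so that the factor stays at unity."""
    return 2.0 * math.pi * nu_2 * tau_g


nu_2, nu_0 = 1.0e9, 0.2e9   # Hz, hypothetical second LO and IF center
tau_g = 0.25e-9             # s, hypothetical geometric delay

case1 = amplitude_factor(nu_2, nu_0, tau_g, tau_g, 0.0, 0.0, 0.0)
case2_raw = amplitude_factor(nu_2, nu_0, tau_g, 0.0, tau_g, 0.0, 0.0)
case2_fixed = amplitude_factor(nu_2, nu_0, tau_g, 0.0, tau_g, 0.0,
                               compensating_phase(nu_2, tau_g))
print(f"case 1: {case1:.3f}")
print(f"case 2, no phase tracking: {case2_raw:.3f}")
print(f"case 2, with phase tracking: {case2_fixed:.3f}")
```

With these numbers the uncompensated case-2 factor is essentially zero, because $2\pi\nu_2\tau_g = \pi/2$, while tracking the phase restores it to unity.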


6.1.10 Fringe Stopping in a Double-Sideband System 


Consider two antennas of an array as shown in Fig. 6.6 and the case in which the
instrumental delay that compensates for $\tau_g$ is the one immediately preceding the
correlator, so that $\tau_{i1} = 0$. One can think of interferometer fringes as being caused
by a Doppler shift in the signal at one antenna, which results in a beat frequency
when the signals are combined in the correlator. Suppose that the geometric delay,
$\tau_g$, in the signal path to antenna m (on the left side of the diagram) is increasing
with time, that is, antenna m is moving away from the source relative to antenna n.
Then a signal at frequency $\nu_{\rm RF}$ at the wavefront from a source appears at frequency
$\nu_{\rm RF}(1 - d\tau_g/dt)$ when received at antenna m. If the signal is in the upper sideband,
its frequency at the correlator input will be

$$\nu_{\rm RF}\left(1 - \frac{d\tau_g}{dt}\right) - (\nu_1 + \nu_2) . \tag{6.26}$$


To stop the fringes, we need to apply a corresponding decrease to the frequency
of the signal from antenna n so that the signals arrive at the correlator at the same
frequency. To do this, we increase the frequencies of the two LOs for antenna n by
the factor $(1 + d\tau_g/dt)$. Note that this is equivalent to adding $2\pi(d\tau_g/dt)\nu_1$ to $\theta_{n1}$ and
$2\pi(d\tau_g/dt)\nu_2$ to $\theta_{n2}$, which are the rates of change of the oscillator phases required
to maintain each of the two cosine functions in Eq. (6.25) at constant value. The
corresponding signal from antenna n traverses the delay $\tau_{i2}$ at a frequency
$\nu_{\rm RF} - (\nu_1 + \nu_2)(1 + d\tau_g/dt)$, and since the delay is continuously adjusted to equal $\tau_g$,
the signal suffers a reduction in frequency by a factor $(1 - d\tau_g/dt)$. Thus, at the
correlator input, the frequency of the antenna-n signal is

$$\left[\nu_{\rm RF} - (\nu_1 + \nu_2)\left(1 + \frac{d\tau_g}{dt}\right)\right]\left(1 - \frac{d\tau_g}{dt}\right) , \tag{6.27}$$

which is equal to (6.26) when second-order terms in $d\tau_g/dt$ are neglected. (Recall
that for, e.g., a 1-km baseline, the highest possible value of $d\tau_g/dt$ is $2.42 \times 10^{-10}$.)
For the lower sideband, (6.26) and (6.27) apply if the signs of both $\nu_{\rm RF}$ and $\nu_1$ are
reversed, and again the frequencies at the correlator input are equal. Thus, the overall
effect is that the fringes are stopped for both sidebands.
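The size of the neglected second-order term can be checked with exact rational arithmetic, since the residual is far below double-precision resolution at gigahertz frequencies. The sketch below assumes that the largest delay rate for a baseline of length $D$ is $\omega_e D/c$, with $\omega_e$ the rotation rate of the Earth. The frequencies are hypothetical and the function names are invented for this example.

```python
from fractions import Fraction

EARTH_ROTATION_RATE = Fraction(72921, 10**9)  # rad/s, approximate value
SPEED_OF_LIGHT = 299792458                    # m/s


def max_delay_rate(baseline_m):
    """Assumed upper bound on d(tau_g)/dt for a baseline of the given length."""
    return EARTH_ROTATION_RATE * baseline_m / SPEED_OF_LIGHT


def freq_from_antenna_m(nu_rf, nu_1, nu_2, rate):
    """Correlator-input frequency of the upper-sideband signal, Eq. (6.26)."""
    return nu_rf * (1 - rate) - (nu_1 + nu_2)


def freq_from_antenna_n(nu_rf, nu_1, nu_2, rate):
    """Frequency after the LO offsets and the tracking delay, Eq. (6.27)."""
    return (nu_rf - (nu_1 + nu_2) * (1 + rate)) * (1 - rate)


rate = max_delay_rate(1000)
# Hypothetical frequencies in Hz: RF, first LO, second LO.
nu_rf, nu_1, nu_2 = 230_000_000_000, 220_000_000_000, 8_000_000_000
residual = (freq_from_antenna_m(nu_rf, nu_1, nu_2, rate)
            - freq_from_antenna_n(nu_rf, nu_1, nu_2, rate))
print(f"max delay rate  = {float(rate):.3e}")
print(f"residual fringe = {float(residual):.3e} Hz")
```

The delay rate comes out near $2.4 \times 10^{-10}$, close to the value quoted in the text, and the difference between the two expressions is exactly $-(\nu_1 + \nu_2)(d\tau_g/dt)^2$, of order $10^{-8}$ Hz for these numbers.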


6.1.11 Relative Advantages of Double- and Single-Sideband Systems


The principal reason for using DSB reception in interferometry is that in certain 
cases, the lowest receiver noise temperatures are obtained by using input stages 
that are inherently DSB devices. As frequency increases above ~100 GHz, it
becomes increasingly difficult to make low-noise amplifiers, and receiving systems
often use a mixer of the superconductor-insulator-superconductor (SIS) type [see,
e.g., Tucker and Feldman (1985)] as the input stage followed by a low-noise IF 
amplifier. Both the mixer and the IF amplifier are cryogenically cooled to obtain 
superconductivity in the mixer and to minimize the amplifier noise. If a filter is 
placed between the antenna and the mixer to cut out one sideband, the received 
signal power is halved, but there is no reduction in the receiver noise generated in 
the mixer and IF stages. Thus, the signal-to-noise ratio (SNR) in the IF stages is 
reduced, and in this case, the best continuum sensitivity may be obtained if both 
sidebands are retained. As a historical note, DSB systems were used at centimeter 
wavelengths during the 1960s and early 1970s [see, e.g., Read (1961)], sometimes 
with a degenerate type of parametric amplifier as the low-noise input stage. These 
amplifiers were inherently DSB devices, and their use in interferometry is discussed 
by Vander Vorst and Colvin (1966). 

DSB systems have a number of disadvantages. Increased accuracy of delay 
setting is required, frequency and phase adjustment on more than one LO is likely 
to be required, interpretation of spectral line data is complicated if there are lines in 
both sidebands, and the width of the interference-free spectrum required is doubled. 
Also, the smearing effect of a finite bandwidth, to be discussed in Sect. 6.3, is 
increased. These problems have stimulated the development of schemes by which 
the responses for upper and lower sidebands can be separated. 


6.1.12 Sideband Separation 


To illustrate the method by which the responses for the two sidebands can be 
separated at the correlator output of a DSB receiving system, we examine the sum 
of the upper- and lower-sideband responses from Eqs. (6.11) and (6.15). This is 


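$$\begin{aligned}
r_d = r_u + r_\ell = |\mathcal{V}||G_{mn}(\Delta\tau)|\big\{&\cos\left[2\pi(\nu_{\rm LO}\tau_g + \nu_0\Delta\tau) + \theta_{mn} - \phi_v + \phi_G\right] \\
+ &\cos\left[2\pi(\nu_{\rm LO}\tau_g - \nu_0\Delta\tau) + \theta_{mn} - \phi_v - \phi_G\right]\big\} ,
\end{aligned} \tag{6.28}$$

where $\theta_{mn} = \theta_m - \theta_n$. Equation (6.28) represents the real output of a complex
correlator. We rewrite Eq. (6.28) as

$$r_d = |\mathcal{V}||G_{mn}|(\cos\Psi_u + \cos\Psi_\ell) , \tag{6.29}$$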
where $\Psi_u$ and $\Psi_\ell$ represent the corresponding expressions in square brackets in
Eq. (6.28). The responses considered above represent the normal output of the
interferometer, which we call condition 1. The expression for the imaginary output
of the correlator is obtained by replacing $\phi_G$ by $\phi_G - \pi/2$. Consider a second
condition in which a $\pi/2$ phase shift is introduced into the first LO signal of antenna
m, so that $\theta_{mn}$ becomes $\theta_{mn} - \pi/2$. The correlator outputs for the two conditions are
obtained from Eqs. (6.28) and (6.29):

condition 1:

$$\begin{aligned}
r_1 &= |\mathcal{V}||G_{mn}|(\cos\Psi_u + \cos\Psi_\ell) \\
r_2 &= |\mathcal{V}||G_{mn}|(\sin\Psi_u - \sin\Psi_\ell)
\end{aligned} \tag{6.30}$$

condition 2 ($\theta_{mn} \to \theta_{mn} - \pi/2$):

$$\begin{aligned}
r_3 &= |\mathcal{V}||G_{mn}|(\sin\Psi_u + \sin\Psi_\ell) \\
r_4 &= |\mathcal{V}||G_{mn}|(-\cos\Psi_u + \cos\Psi_\ell)
\end{aligned} \tag{6.31}$$

where $r_1$ and $r_3$ represent the real outputs of the correlator and $r_2$ and $r_4$ the imaginary
outputs. Thus, the upper-sideband response, expressed in complex form, is

$$|\mathcal{V}||G_{mn}|(\cos\Psi_u + j\sin\Psi_u) = \frac{1}{2}\left[(r_1 - r_4) + j(r_2 + r_3)\right] . \tag{6.32}$$

Similarly, the lower-sideband response is

$$|\mathcal{V}||G_{mn}|(\cos\Psi_\ell + j\sin\Psi_\ell) = \frac{1}{2}\left[(r_1 + r_4) - j(r_2 - r_3)\right] . \tag{6.33}$$


If the $\pi/2$ phase shift is periodically switched into and out of the LO signal, the
upper- and lower-sideband responses can be obtained as indicated by Eqs. (6.32) 
and (6.33). 
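The separation described by Eqs. (6.30) to (6.33) amounts to a few additions and subtractions, as the following sketch shows. It generates the four correlator readings from chosen sideband phases and then recovers each sideband. The function names are hypothetical, and allowing the two sidebands to have different amplitudes is a generalization added here for illustration; the equations in the text use a common factor $|\mathcal{V}||G_{mn}|$.

```python
import cmath
import math


def correlator_outputs(psi_u, psi_l, amp_u=1.0, amp_l=1.0):
    """Four readings r1..r4: real and imaginary outputs for conditions 1 and 2."""
    r1 = amp_u * math.cos(psi_u) + amp_l * math.cos(psi_l)   # condition 1, real
    r2 = amp_u * math.sin(psi_u) - amp_l * math.sin(psi_l)   # condition 1, imaginary
    r3 = amp_u * math.sin(psi_u) + amp_l * math.sin(psi_l)   # condition 2, real
    r4 = -amp_u * math.cos(psi_u) + amp_l * math.cos(psi_l)  # condition 2, imaginary
    return r1, r2, r3, r4


def separate_sidebands(r1, r2, r3, r4):
    """Combine the readings as in Eqs. (6.32) and (6.33)."""
    upper = 0.5 * complex(r1 - r4, r2 + r3)
    lower = 0.5 * complex(r1 + r4, -(r2 - r3))
    return upper, lower


readings = correlator_outputs(psi_u=0.3, psi_l=-1.2, amp_u=1.0, amp_l=0.6)
upper, lower = separate_sidebands(*readings)
print(f"upper: amplitude {abs(upper):.3f}, phase {cmath.phase(upper):.3f}")
print(f"lower: amplitude {abs(lower):.3f}, phase {cmath.phase(lower):.3f}")
```

Each recovered phasor reproduces the amplitude and phase that went into the corresponding sideband.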

A similar implementation of sideband separation that makes use of fringe 
frequencies is attributable to B. G. Clark. This method is based on the fact that a 
small frequency shift in the first LO adds the same frequency shift to the fringes at 
the correlator for both sidebands, but a similar shift in a later LO adds to the fringe 
frequency for one sideband but subtracts from it for the other. Consider two antennas 
of an array in which the fringes have been stopped as in the discussion associated 
with expressions (6.26) and (6.27). Now suppose that we increase the frequency 
of the first LO at antenna n by a frequency $\delta\nu$ and decrease the frequency of the
second LO by the same amount. The fringe frequency for the upper-sideband signal
will be unchanged; that is, the fringes will remain stopped. For the lower sideband,
the signal frequencies after the second mixer will be decreased by $2\delta\nu$. The lower-
sideband output will consist of fringes at frequency $2\delta\nu(1 - d\tau_g/dt) \approx 2\delta\nu$ and
will be averaged to a small residual if $(2\delta\nu)^{-1}$ is small compared with the integration
period at the correlator output, or if an integral number of fringe cycles fall within
such an integration period. If the frequency of the second LO is increased by $\delta\nu$
instead of decreased, the lower sideband will be stopped and the upper one averaged
out. To apply this scheme to an array of $n_a$ antennas, the offset must be different for
each antenna, and this can be achieved by using an offset $n\,\delta\nu$ for antenna n, where n
runs from 0 to $n_a - 1$. An advantage of this sideband-separating scheme is that it can
be implemented using the variable LOs required for fringe stopping, and no other
special hardware is needed. Unlike the $\pi/2$ phase-switching scheme, one sideband
is lost in this method. However, as mentioned above, sideband separation schemes 
of this type separate only the correlated component of the signal and not the noise. 
To separate the noise, the SIS mixers at the receiver inputs can be mounted in a 
sideband-separating circuit of the type described in Appendix 7.1. In such cases, the 
isolation of the sidebands achieved in the mixer circuit may be only ~ 15 dB, which 
is sufficient to remove most of the noise contributed by an unwanted sideband, but 
not sufficient to remove strong spectral lines. The Clark technique described above is 
nicely suited to increasing the suppression of an unwanted sideband that has already 
suffered limited rejection at the mixer. 
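How well the unwanted sideband averages out in the frequency-offset scheme can be estimated with a short calculation. The sketch below averages a unit-amplitude fringe at frequency $2\delta\nu$ over an integration period using a midpoint sum. The function name, the offset of 5 Hz, and the integration periods are illustrative assumptions rather than values from the text.

```python
import math


def averaged_fringe(offset_hz, t_int, n_steps=10000):
    """Mean of cos(2*pi*(2*offset)*t) over 0..t_int, by a midpoint sum."""
    fringe_hz = 2.0 * offset_hz
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * t_int / n_steps
        total += math.cos(2.0 * math.pi * fringe_hz * t)
    return total / n_steps


# 5-Hz LO offset gives 10-Hz fringes in the unwanted sideband.
print(f"integral cycles (1.000 s): {averaged_fringe(5.0, 1.000):.2e}")
print(f"10.25 cycles    (1.025 s): {averaged_fringe(5.0, 1.025):.2e}")
```

When a whole number of fringe cycles fits in the integration period the residual vanishes; otherwise it is bounded by $1/(2\pi \cdot 2\delta\nu \cdot T)$ for integration period $T$, which is small once many cycles are averaged.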

Fringe-frequency effects can also be used for sideband separation in VLBI 
observations. In VLBI systems, the fringe rotation is usually applied during 
playback. Fringe rotation then has the effect of reducing the fringe frequency for 
one sideband and increasing it for the other. If the fringe rotation is set to stop the 
fringes in one sideband, then since the baselines are so long, fringes resulting from 
the other sideband will often have a sufficiently high frequency that they will be 
reduced to a negligible level by the time averaging at the correlator output. The data 
are played back to the correlator twice, once for each sideband, with appropriate 
fringe rotation. 


6.2 Response to the Noise 


The ultimate sensitivity of a receiving system is determined principally by the 
system noise. We now consider the response to the noise and the resulting threshold 
of sensitivity, beginning with the effect at the correlator output and the resulting 
uncertainty in the real and imaginary parts of the visibility, V. This leads to 
calculation of the rms noise level in a synthesized image in terms of the peak 
response to a source of given flux density. Finally, we consider the effect of noise in 
terms of the rms fluctuations in the amplitude and phase of V. 


6.2.1 Signal and Noise Processing in the Correlator 


Consider an observation in which the field to be imaged contains only a point source 
located at the phase reference position. Let $V_m(t)$ and $V_n(t)$ be the waveforms at the
correlator input from the signal channels of antennas m and n. The output is

$$r = \langle V_m(t)V_n(t)\rangle , \tag{6.34}$$


where all three functions are real, and the expectation denoted by the angular 
brackets is approximated in practice by a finite time average. To determine the 
relative power levels of the signal and noise components of r, we determine their 
power spectra by first calculating the autocorrelation functions. The autocorrelation 
of the signal product in Eq. (6.34) is 


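$$\rho_r(\tau) = \langle V_m(t)V_n(t)V_m(t - \tau)V_n(t - \tau)\rangle . \tag{6.35}$$

This expression can be evaluated using the following fourth-order moment relation⁴:

$$\langle z_1 z_2 z_3 z_4\rangle = \langle z_1 z_2\rangle\langle z_3 z_4\rangle + \langle z_1 z_3\rangle\langle z_2 z_4\rangle + \langle z_1 z_4\rangle\langle z_2 z_3\rangle , \tag{6.36}$$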


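where $z_1$, $z_2$, $z_3$, and $z_4$ are joint Gaussian random variables with zero mean. Thus,

$$\begin{aligned}
\rho_r(\tau) &= \langle V_m(t)V_n(t)\rangle\langle V_m(t - \tau)V_n(t - \tau)\rangle \\
&\quad + \langle V_m(t)V_m(t - \tau)\rangle\langle V_n(t)V_n(t - \tau)\rangle \\
&\quad + \langle V_m(t)V_n(t - \tau)\rangle\langle V_m(t - \tau)V_n(t)\rangle \\
&= \rho_{mn}^2(0) + \rho_m(\tau)\rho_n(\tau) + \rho_{mn}(\tau)\rho_{mn}(-\tau) ,
\end{aligned} \tag{6.37}$$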

where $\rho_m$ and $\rho_n$ are the unnormalized autocorrelation functions of the two signals
$V_m$ and $V_n$, respectively, and $\rho_{mn}$ is their cross-correlation function. Each $V$ term is
the sum of a signal component s and a noise component n, and to examine how these 
components contribute to the correlator output, we substitute them in Eq. (6.37). 
Products of uncorrelated terms, that is, products of signal and noise voltages, or 
noise voltages from different antennas, have an expectation of zero, and omitting 
them, we obtain 


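$$\begin{aligned}
\rho_r(\tau) &= \langle s_m(t)s_n(t)\rangle\langle s_m(t - \tau)s_n(t - \tau)\rangle \\
&\quad + \langle s_m(t)s_m(t - \tau) + n_m(t)n_m(t - \tau)\rangle\langle s_n(t)s_n(t - \tau) + n_n(t)n_n(t - \tau)\rangle \\
&\quad + \langle s_m(t)s_n(t - \tau)\rangle\langle s_m(t - \tau)s_n(t)\rangle ,
\end{aligned} \tag{6.38}$$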


where the three lines on the right side correspond to the three terms on the last line
of Eq. (6.37). To determine the effect of the frequency response of the receiving
system on the various terms of $\rho_r(\tau)$, we need to convert them to power spectra. By
the Wiener-Khinchin relation, we should therefore examine the Fourier transforms
of each term on the right sides of Eqs. (6.37) and (6.38).

⁴ This relation is a special case of a more general expression for the expectation of the product of
N such variables, which is zero if N is odd and a sum of pair products if N is even. A form of
Eq. (6.36) can be found in Lawson and Uhlenbeck (1950), Middleton (1960), and Wozencraft and
Jacobs (1965).
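The fourth-order moment relation of Eq. (6.36) can be checked by simulation. The sketch below builds four correlated zero-mean Gaussian variables from three independent ones (an arbitrary construction chosen for this example, for which both sides of the relation equal 2) and compares the sample fourth moment with the sum of pair products. The function name and sample size are assumptions.

```python
import random


def fourth_moment_check(n_trials=200_000, seed=1):
    """Return sample estimates of the two sides of Eq. (6.36)."""
    rng = random.Random(seed)
    s1234 = s12 = s34 = s13 = s24 = s14 = s23 = 0.0
    for _ in range(n_trials):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        c = rng.gauss(0.0, 1.0)
        # Linear combinations of Gaussians are jointly Gaussian.
        z1, z2, z3, z4 = a, a + b, b + c, a + c
        s1234 += z1 * z2 * z3 * z4
        s12 += z1 * z2
        s34 += z3 * z4
        s13 += z1 * z3
        s24 += z2 * z4
        s14 += z1 * z4
        s23 += z2 * z3
    n = float(n_trials)
    lhs = s1234 / n
    rhs = (s12 / n) * (s34 / n) + (s13 / n) * (s24 / n) + (s14 / n) * (s23 / n)
    return lhs, rhs


lhs, rhs = fourth_moment_check()
print(f"<z1 z2 z3 z4> = {lhs:.3f}, sum of pair products = {rhs:.3f}")
```

The two estimates agree to within the sampling error, which shrinks as the inverse square root of the number of trials.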

The first term from Eq. (6.37), $\rho_{mn}^2(0)$, is a constant, and its Fourier transform is
a delta function at the origin in the frequency domain, multiplied by $\rho_{mn}^2(0)$. From
Eq. (6.38), we see that $\rho_{mn}^2(0)$ involves only the signal terms, which it is convenient
to express as antenna temperatures. By the integral theorem of Fourier transforms,
$\rho_{mn}(0)$ is the infinite integral of the Fourier transform of $\rho_{mn}(\tau)$, and thus the Fourier
transform of $\rho_{mn}^2(0)$ is

$$k^2 T_{Am}T_{An}\left[\int_{-\infty}^{\infty} H_m(\nu)H_n^*(\nu)\,d\nu\right]^2\delta(\nu) , \tag{6.39}$$

where $k$ is Boltzmann’s constant, $T_{Am}$ and $T_{An}$ are the components of antenna
temperature resulting from the source, $H_m(\nu)$ and $H_n(\nu)$ are the frequency responses
of the signal channels, and $\delta(\nu)$ is the delta function.

The Fourier transform of the second term of Eq. (6.37), $\rho_m(\tau)\rho_n(\tau)$, is the
convolution of the transforms of $\rho_m$ and $\rho_n$, that is,

$$k^2(T_{Am} + T_{Sm})(T_{An} + T_{Sn})\int_{-\infty}^{\infty} H_m(\nu)H_m^*(\nu)H_n(\nu' - \nu)H_n^*(\nu' - \nu)\,d\nu , \tag{6.40}$$

where $T_{Sm}$ and $T_{Sn}$ are the system temperatures. Note that the magnitude of this term
is proportional to the product of the total noise temperatures.

The Fourier transform of the third term of Eq. (6.37), $\rho_{mn}(\tau)\rho_{mn}(-\tau)$, is the
convolution of the transforms of $\rho_{mn}(\tau)$ and $\rho_{mn}(-\tau)$, and the latter is the complex
conjugate of the former, since $\rho_{mn}$ is real. Thus, the Fourier transform of
$\rho_{mn}(\tau)\rho_{mn}(-\tau)$ is

$$k^2 T_{Am}T_{An}\int_{-\infty}^{\infty} H_m(\nu)H_n^*(\nu)H_m^*(\nu' - \nu)H_n(\nu' - \nu)\,d\nu . \tag{6.41}$$

In expression (6.41), as in (6.39), only the antenna temperatures appear, because
the receiver noise for different antennas makes no contribution to the cross-correlation.

Expression (6.39) represents the signal power in the correlator output, and (6.40)
and (6.41) represent the noise. The effect of the time averaging at the correlator
output can be modeled in terms of a filter that passes frequencies from 0 to $\Delta\nu_{\rm LF}$.
The output bandwidth $\Delta\nu_{\rm LF}$ is less than the correlator input bandwidth by several
or many orders of magnitude. Therefore, the spectral density of the output noise
can be assumed to be equal to its value at zero frequency, that is, for $\nu' = 0$
in (6.40) and (6.41). From these considerations, and because $H_m(\nu)$ and $H_n(\nu)$ are
Hermitian, the ratio of the signal voltage to the rms noise voltage after averaging at
the correlator output is

$$R_{sn} = \frac{\sqrt{T_{Am}T_{An}}\displaystyle\int_{-\infty}^{\infty} H_m(\nu)H_n^*(\nu)\,d\nu}{\sqrt{\left[(T_{Am} + T_{Sm})(T_{An} + T_{Sn}) + T_{Am}T_{An}\right]2\Delta\nu_{\rm LF}\displaystyle\int_{-\infty}^{\infty}|H_m(\nu)|^2|H_n(\nu)|^2\,d\nu}} , \tag{6.42}$$

where $2\Delta\nu_{\rm LF}$ is the equivalent bandwidth after averaging, with negative frequencies
included. It is unusual for $R_{sn}$, the estimate of the SNR at the output of a simple
correlator, to be required to an accuracy better than a few percent. Indeed, it is
usually difficult to specify $T_S$ to any greater accuracy since the effects of ground
radiation and atmospheric absorption on $T_S$ vary as the antennas track. Thus, it
is usually satisfactory to approximate $H_m(\nu)$ and $H_n(\nu)$ by identical rectangular
functions of width $\Delta\nu_{\rm IF}$. Also, in sensitivity calculations, one is concerned most
often with sources near the threshold of detectability, for which $T_A \ll T_S$. With
these simplifications, Eq. (6.42) becomes

$$R_{sn} = \sqrt{\frac{T_{Am}T_{An}}{T_{Sm}T_{Sn}}}\sqrt{\frac{\Delta\nu_{\rm IF}}{\Delta\nu_{\rm LF}}} . \tag{6.43}$$

Figure 6.7 shows the signal and noise spectra for the rectangular bandpass approximation.
Note that the input spectra $|H_m(\nu)|^2$ and $|H_n(\nu)|^2$ contain both positive and
negative frequencies and are symmetric about the origin in $\nu$. Thus, the output noise
spectrum can be described as proportional to either the convolution or the
cross-correlation function of $|H_m(\nu)|^2$ and $|H_n(\nu)|^2$.

The output bandwidth is related to the data averaging time $\tau_a$ since the averaging
can be described as convolution in the time domain with a rectangular function of
unit area and width $\tau_a$. The power response of the averaging circuit as a function
of frequency is the square of the Fourier transform of the rectangular function,
that is, $\sin^2(\pi\tau_a\nu)/(\pi\tau_a\nu)^2$. The equivalent bandwidth, including both positive and
negative frequencies, is

$$2\Delta\nu_{\rm LF} = \int_{-\infty}^{\infty}\frac{\sin^2(\pi\tau_a\nu)}{(\pi\tau_a\nu)^2}\,d\nu = \frac{1}{\tau_a} . \tag{6.44}$$

Then from Eq. (6.43), we obtain

$$R_{sn} = \sqrt{\left(\frac{T_{Am}T_{An}}{T_{Sm}T_{Sn}}\right)2\Delta\nu_{\rm IF}\tau_a} . \tag{6.45}$$

Note that $2\Delta\nu_{\rm IF}\tau_a$ is the number of independent samples of the signal in time $\tau_a$.

Fig. 6.7 Spectra of (a) the input and (b) the output waveforms of a correlator. The input passbands
are rectangular of width $\Delta\nu_{\rm IF}$. Shown in (b) is the complete spectrum of signals generated in the
multiplication process, including noise bands at twice the input frequency. Only frequencies very
close to zero are passed by the averaging circuit at the correlator output. These include the wanted
signal, the spectrum of which has the form of a delta function and is represented by the arrow. It is
assumed that $T_A \ll T_S$.


If the source is unpolarized, each antenna responds to half the total flux density
$S$, and the received power density is

$$kT_A = \frac{1}{2}AS , \tag{6.46}$$

where $A$ is the effective collecting area of the antenna. For identical antennas and
system temperatures, we obtain, from Eqs. (6.45) and (6.46),

$$R_{sn} = \frac{AS}{kT_S}\sqrt{\frac{\Delta\nu_{\rm IF}\tau_a}{2}} . \tag{6.47}$$


Similar derivations of this result can be found in Blum (1959), Colvin (1961), and
Tiuri (1964). Usually the result in Eq. (6.47), in which we have assumed $T_A \ll T_S$,
is the one needed. At the other extreme, which may be encountered in observations
of very strong, unresolved sources for which $T_A \gg T_S$, we have $R_{sn} = \sqrt{\Delta\nu_{\rm IF}\tau_a}$.
The SNR is then determined by the fluctuations in signal level and is independent
of the areas of the antennas. Anantharamaiah et al. (1989) give a discussion of noise
levels in the observation of very strong sources.
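Both limits follow from Eq. (6.42) once the passbands are taken as identical rectangles and the antennas as identical, with $2\Delta\nu_{\rm LF} = 1/\tau_a$ from Eq. (6.44). Under those assumptions the expression reduces to $R_{sn} = T_A\sqrt{2\Delta\nu_{\rm IF}\tau_a}/\sqrt{(T_A + T_S)^2 + T_A^2}$, which the sketch below evaluates. The function name and the numerical values are hypothetical.

```python
import math


def snr_rectangular(t_a, t_s, bandwidth_hz, tau_a):
    """Eq. (6.42) for identical rectangular passbands and identical antennas."""
    noise_term = math.sqrt((t_a + t_s) ** 2 + t_a ** 2)
    return t_a * math.sqrt(2.0 * bandwidth_hz * tau_a) / noise_term


bandwidth_hz, tau_a, t_s = 1.0e8, 10.0, 50.0  # hypothetical values

weak = snr_rectangular(1.0e-3, t_s, bandwidth_hz, tau_a)
weak_limit = (1.0e-3 / t_s) * math.sqrt(2.0 * bandwidth_hz * tau_a)   # Eq. (6.45)
strong = snr_rectangular(1.0e6, t_s, bandwidth_hz, tau_a)
strong_limit = math.sqrt(bandwidth_hz * tau_a)

print(f"weak source:   {weak:.4f}  (limit {weak_limit:.4f})")
print(f"strong source: {strong:.1f}  (limit {strong_limit:.1f})")
```

For a weak source the result approaches Eq. (6.45), and for a very strong source it saturates at $\sqrt{\Delta\nu_{\rm IF}\tau_a}$ regardless of the antenna temperature.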

From Fig. 6.7, we can see how the factor $\sqrt{\Delta\nu_{\rm IF}\tau_a}$ in Eq. (6.47), which enables
very high sensitivity to be achieved in radio astronomy, arises. The noise within the
correlator results from beats between components in the two input bands and thus
extends in frequency up to $\Delta\nu_{\rm IF}$. The triangular noise spectrum in Fig. 6.7 is simply
proportional to the number of beats per unit frequency interval. However, only the
very small fraction of this noise that falls within the output bandwidth is retained
after the averaging. Note that the signal bandwidth $\Delta\nu_{\rm IF}$ that is important here is the
bandwidth at the correlator input. In a DSB system, this is only one-half of the total
input bandwidth at the antenna.

One other factor that affects the SNR should be introduced at this point. If the
signals are quantized and digitized before entering the correlators, a quantization
efficiency $\eta_Q$ related to the quantization must be included, and Eq. (6.47) becomes

$$R_{sn} = \frac{AS\eta_Q}{kT_S}\sqrt{\frac{\Delta\nu_{\rm IF}\tau_a}{2}} , \tag{6.48}$$

or in terms of antenna temperature,

$$R_{sn} = \frac{T_A\eta_Q}{T_S}\sqrt{2\Delta\nu_{\rm IF}\tau_a} . \tag{6.49}$$

Values of $\eta_Q$ vary between 0.637 and 1.0 and are discussed in Chap. 8 (see
Table 8.1). In VLBI observing, other losses affect the SNR, as discussed in Sect. 9.7.


6.2.2 Noise in the Measurement of Complex Visibility 


To understand precisely what $R_{sn}$ represents, note that in deriving Eqs. (6.48)
and (6.49), no delay was introduced between the signal components at the correlator,
and the phase responses of the signal channels were assumed to be identical. Thus,
the source is in the central fringe of the interferometer pattern, and the response is
the peak fringe amplitude, which represents the modulus of the visibility. To express
the rms noise level at the correlator output in terms of the flux density $\sigma$ of an
unresolved source for which the peak fringe amplitude produces an equal output,
we put $R_{sn} = 1$ in Eq. (6.48) and replace $S$ by $\sigma$:

$$\sigma = \frac{\sqrt{2}\,kT_S}{A\eta_Q\sqrt{\Delta\nu_{\rm IF}\tau_a}} , \tag{6.50}$$

where $\sigma$ is in units of W m$^{-2}$ Hz$^{-1}$. Consider the case of an instrument with a
complex correlator in which the output oscillations are slowed to zero frequency 
as described earlier. The noise fluctuations in the real and imaginary outputs are 
uncorrelated, as we now show. Suppose the antennas are pointed at blank sky so
that the only inputs to the correlators in Fig. 6.3 are the noise waveforms $n_m$, $n_n$,
and $n_m^H$, where the last is the Hilbert transform of $n_m$ produced by the quadrature
phase shift. The expectation of the product of the real and imaginary outputs is
$\langle n_m n_n n_m^H n_n\rangle$, which can be shown to be zero by using Eq. (6.36) and noting that the
expectations $\langle n_m n_n\rangle$, $\langle n_m n_m^H\rangle$, and $\langle n_m^H n_n\rangle$ must all be zero. Thus, the noise from the
real and imaginary outputs is uncorrelated.⁵

Fig. 6.8 Complex quantity Z, which is the sum of the modulus of the true complex
visibility $|\mathcal{V}|$ and the noise $\varepsilon$. The noise has real and imaginary components of rms
amplitude $\sigma$, and $\phi$ is the phase deviation resulting from the noise.

The signal and noise components in the measurement of the complex visibility
are shown in Fig. 6.8 as vectors in the complex plane. Here $V$ represents the
visibility as it would be measured in the absence of noise, which is assumed to
be along the x, or real, axis; and $\mathbf{Z}$ represents the sum of the visibility and noise,
$V + \boldsymbol{\varepsilon}$. We consider $\mathbf{Z}$ and $\boldsymbol{\varepsilon}$ to be vectors whose components correspond to the
real and imaginary parts of the corresponding quantities. The components of $\boldsymbol{\varepsilon}$ are
independent Gaussian random variables with zero mean and variance $\sigma^2$. Hence, the
noise in both components of $\mathbf{Z}$ has an rms amplitude $\sigma$, and

$$\langle |\mathbf{Z}|^2 \rangle = |V|^2 + 2\sigma^2\,. \tag{6.51}$$


The factor of two arises because of the contributions of the real and imaginary parts
of $\boldsymbol{\varepsilon}$. If the measurement is made using only a single-multiplier correlator, one can
periodically introduce a quadrature phase shift at one input, thus obtaining real and
imaginary outputs, each for half of the observing time. Then the data are half of
those that would be obtained with a complex correlator, and the noise in the visibility
measurement is greater by $\sqrt{2}$.

⁵ The noise in the correlator outputs is composed of an ensemble of components of frequency
$|\nu_m - \nu_n|$, where $\nu_m$ and $\nu_n$ are frequency components of the correlator inputs $n_m$ and $n_n$.
Components of the imaginary output are shifted in phase by $\pi/2$ relative to the corresponding
components of the real output. Note that for any pair of input components, the sign of the shift in
the imaginary output takes opposite values depending on whether $\nu_m > \nu_n$ or $\nu_m < \nu_n$. As a result,
the noise waveforms at the correlator outputs are not a Hilbert transform pair, and one cannot be
derived from the other.
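
Equation (6.51) can be checked numerically. The sketch below is a simple Monte Carlo experiment, assuming independent Gaussian noise of rms $\sigma$ in the real and imaginary parts as stated above; the function name, sample count, and seed are arbitrary illustrative choices.

```python
import random


def mean_square_modulus(v, sigma, n_samples=100_000, seed=1):
    """Estimate <|Z|^2> for Z = V + noise, with V taken along the real axis."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        re = v + rng.gauss(0.0, sigma)  # real part: signal plus noise
        im = rng.gauss(0.0, sigma)      # imaginary part: noise only
        total += re * re + im * im
    return total / n_samples


v, sigma = 3.0, 1.0
print(mean_square_modulus(v, sigma), v**2 + 2.0 * sigma**2)
```

The estimate should approach $|V|^2 + 2\sigma^2$ as the number of samples grows, with the factor of two reflecting the two noise components.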


6.2.3 Signal-to-Noise Ratio in a Synthesized Image 


Having determined the noise-induced error in the visibility, the next step is to 
consider the SNR in an image. Consider an array with $n_p$ antenna pairs, and suppose
that the visibility data are averaged for time $\tau_a$, and that the whole observation covers
a time interval $\tau_0$. The total number of independent data points in the (u, v) plane is
therefore

$$n_d = n_p\frac{\tau_0}{\tau_a}\,. \tag{6.52}$$


In imaging an unresolved source at the field center for which the visibility data
combine in phase, we should thus expect the SNR in the image to be greater than
that in Eqs. (6.48) and (6.49) by a factor $\sqrt{n_p\tau_0/\tau_a}$. This simple consideration gives
the correct result for the case in which the data are combined with equal weights.
We now derive the result for the more general case of arbitrarily weighted data.
The ensemble of measured data can be represented by 


$$\sum_{i=1}^{n_d}\left[{}^2\delta(u - u_i, v - v_i)(V_i + \varepsilon_i) + {}^2\delta(u + u_i, v + v_i)(V_i^* + \varepsilon_i^*)\right], \tag{6.53}$$

where ${}^2\delta$ is the two-dimensional delta function and $\varepsilon_i$ is the complex noise
contribution to the $i$th measurement. Each such data point appears at two (u, v)
locations, reflected through the origin of the (u, v) plane. Before taking the Fourier
transform of the data in Eq. (6.53), each data point is assigned a weight $w_i$ (the
choice of weighting factors is discussed in Sect. 10.2.2). To simplify the calculation,
we assume that the source is unresolved and located at the phase reference point of
the image and therefore produces a constant real visibility $V$ equal to its flux density
$S$. The intensity at the center of the image is then


$$I_0 = \frac{\sum_{i=1}^{n_d} w_i (V + \varepsilon_{Ri})}{\sum_{i=1}^{n_d} w_i}\,, \tag{6.54}$$

where $\varepsilon_{Ri}$ is the real part of the noise, $\varepsilon_i$. Note that the imaginary part of $\varepsilon_i$
vanishes at the origin of the image when the conjugate components are summed.
For neighboring points in the image, the same rms level of noise is distributed
between the real and imaginary parts of $\varepsilon$. The expectation of $I_0$ is

$$\langle I_0 \rangle = V = S\,, \tag{6.55}$$


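since $\langle \varepsilon_{Ri} \rangle = 0$. The variance of the estimate of the intensity, $\sigma_m^2$, is

$$\sigma_m^2 = \langle I_0^2 \rangle - \langle I_0 \rangle^2 = \frac{\sum w_i^2 \langle \varepsilon_{Ri}^2 \rangle}{\left(\sum w_i\right)^2}\,. \tag{6.56}$$

Equation (6.56) is derived directly from Eq. (6.54) using the fact that the noise terms
from different (u, v) locations are uncorrelated, that is, $\langle \varepsilon_{Ri}\varepsilon_{Rj} \rangle = 0$ for $i \neq j$.
We define the mean weighting factor $w_{\rm mean}$ and rms weighting factor $w_{\rm rms}$ by the
equations

$$w_{\rm mean} = \frac{1}{n_d}\sum w_i \tag{6.57}$$

and

$$w_{\rm rms}^2 = \frac{1}{n_d}\sum w_i^2\,. \tag{6.58}$$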

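The rms noise contribution [see Eq. (6.51)] is the same for each (u, v) point, so that
$\langle \varepsilon_{Ri}^2 \rangle = \sigma^2$, where $\sigma$ is given by Eq. (6.50). Thus, the SNR can be calculated
from Eqs. (6.55), (6.56), (6.57), and (6.58) as

$$\frac{\langle I_0 \rangle}{\sigma_m} = \frac{S\sqrt{n_d}}{\sigma}\,\frac{w_{\rm mean}}{w_{\rm rms}}\,. \tag{6.59}$$

For an array with complex correlators, we have, from Eq. (6.50),

$$\frac{\langle I_0 \rangle}{\sigma_m} = \frac{A S \eta_Q \sqrt{n_d\,\Delta\nu_{\rm IF}\,\tau_a}}{\sqrt{2}\,k T_S}\,\frac{w_{\rm mean}}{w_{\rm rms}}\,. \tag{6.60}$$

If combinations of all pairs of antennas are used, $n_p = \frac{1}{2}n_a(n_a - 1)$, where $n_a$ is the
number of antennas. Since, from Eq. (6.52), $n_d = n_p\tau_0/\tau_a$, we obtain

$$\frac{\langle I_0 \rangle}{\sigma_m} = \frac{A S \eta_Q \sqrt{n_a(n_a - 1)\,\Delta\nu_{\rm IF}\,\tau_0}}{2 k T_S}\,\frac{w_{\rm mean}}{w_{\rm rms}}\,. \tag{6.61}$$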
To express the rms noise level in terms of flux density, we put $\langle I_0 \rangle/\sigma_m = 1$ in
Eq. (6.61). $S$ then represents the flux density of a point source for which the peak
response is equal to the rms noise level. If we represent this particular value of $S$ by
$S_{\rm rms}$, then

$$S_{\rm rms} = \frac{2 k T_S}{A \eta_Q \sqrt{n_a(n_a - 1)\,\Delta\nu_{\rm IF}\,\tau_0}}\,\frac{w_{\rm rms}}{w_{\rm mean}}\,. \tag{6.62}$$

If all the weighting factors $w_i$ are equal, $w_{\rm mean}/w_{\rm rms} = 1$, and this situation
is referred to as the use of natural weighting. In such a case, the SNR given
by Eq. (6.61) is equal to the corresponding sensitivity for a total-power receiver
combined with an antenna of aperture $\sqrt{n_a(n_a - 1)}\,A$, which approaches $n_a A$ as
$n_a$ becomes large. For an analysis of the sensitivity of single-antenna systems, see
Appendix 1.1.
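
The point-source sensitivity of Eq. (6.62) is straightforward to evaluate. The following sketch, with hypothetical function names and example values, computes $S_{\rm rms}$ for arbitrary weights and compares the natural-weighting result with $\sigma/\sqrt{n_d}$ obtained from Eqs. (6.50) and (6.52).

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def weight_ratio(weights):
    """Return w_mean / w_rms as defined by Eqs. (6.57) and (6.58)."""
    n = len(weights)
    w_mean = sum(weights) / n
    w_rms = math.sqrt(sum(w * w for w in weights) / n)
    return w_mean / w_rms


def image_rms_flux(area, t_sys, n_ant, bandwidth, t_total, eta_q=1.0, weights=None):
    """Eq. (6.62): rms noise in the image in flux-density units (W m^-2 Hz^-1)."""
    ratio = 1.0 if weights is None else weight_ratio(weights)  # equal weights give 1
    base = 2.0 * K_BOLTZMANN * t_sys / (
        area * eta_q * math.sqrt(n_ant * (n_ant - 1) * bandwidth * t_total)
    )
    return base / ratio


# Hypothetical array: 27 antennas, 350 m^2 each, 1 h of observation.
s_rms = image_rms_flux(area=350.0, t_sys=40.0, n_ant=27, bandwidth=1.0e8, t_total=3600.0)
print(s_rms / 1.0e-26, "Jy")
```

Any departure from equal weights increases the returned value through the factor $w_{\rm rms}/w_{\rm mean}$.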

We have considered the point-source sensitivity in Eq. (6.62). In the case of a 
source that is wider than the synthesized beam, it is useful to know the brightness 
sensitivity. The flux density (in W m$^{-2}$ Hz$^{-1}$) received from a broad source of mean
intensity $I$ (W m$^{-2}$ Hz$^{-1}$ sr$^{-1}$) across the synthesized beam is $I\Omega$, where $\Omega$ is
the effective solid angle of the synthesized beam. Thus, the intensity level that is
equal to the rms noise is $S_{\rm rms}/\Omega$. Note that the brightness sensitivity decreases as
the synthesized beam becomes smaller, so compact arrays are best for detecting 
broad, faint sources. However, to measure the intensity of a uniform background, 
a measurement of the total power received in an antenna is required because a 
correlation interferometer does not respond to such a background. 
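
To illustrate how the brightness sensitivity depends on beam size, the sketch below converts a point-source rms $S_{\rm rms}$ into an rms intensity $S_{\rm rms}/\Omega$. It assumes a circular Gaussian synthesized beam, for which $\Omega = \pi\theta^2/(4\ln 2)$ with $\theta$ the full width at half maximum; this beam shape and the example numbers are assumptions made for illustration, not part of the derivation above.

```python
import math


def gaussian_beam_solid_angle(fwhm_rad):
    """Solid angle (sr) of a circular Gaussian beam with the given FWHM in radians."""
    return math.pi * fwhm_rad**2 / (4.0 * math.log(2.0))


def rms_intensity(s_rms, beam_solid_angle):
    """Intensity level (W m^-2 Hz^-1 sr^-1) equal to the rms noise, S_rms / Omega."""
    return s_rms / beam_solid_angle


ARCSEC = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

s_rms = 1.0e-31  # hypothetical point-source rms, W m^-2 Hz^-1 (10 microjansky)
for fwhm_arcsec in (10.0, 1.0):
    omega = gaussian_beam_solid_angle(fwhm_arcsec * ARCSEC)
    print(fwhm_arcsec, rms_intensity(s_rms, omega))
```

For a fixed $S_{\rm rms}$, halving the beamwidth raises the rms intensity by a factor of four, which is why compact configurations are preferred for broad, faint sources.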

The ratio $w_{\rm mean}/w_{\rm rms}$ is less than unity except when the weighting is uniform (all $w_i$ equal).
Although the SNR depends on the choice of weighting, in practice, this dependence 
is not highly critical. The use of natural weighting maximizes the sensitivity for 
detection of a point source in a largely blank field but can also substantially broaden 
the synthesized beam. The advantage in sensitivity is usually small. For example, 
if the density of data points is inversely proportional to the distance from the (u, v) 
origin, as is the case for an east-west array with uniform increments in antenna 
spacing, the weighting factors required to obtain effective uniform density of data 
result in $w_{\rm mean}/w_{\rm rms} = 2\sqrt{2}/3 = 0.94$. In this case, the natural weighting results in
an undesirable beam profile in which the response remains positive for large angular 
distances from the beam axis and dies away only slowly. 
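
That the ratio $w_{\rm mean}/w_{\rm rms}$ cannot exceed unity follows from the Cauchy-Schwarz inequality applied to the weights and a vector of ones. A short worked statement, using only the definitions in Eqs. (6.57) and (6.58):

```latex
\left(\sum_{i=1}^{n_d} w_i \cdot 1\right)^2 \le \left(\sum_{i=1}^{n_d} w_i^2\right)\left(\sum_{i=1}^{n_d} 1^2\right)
= n_d \sum_{i=1}^{n_d} w_i^2
\quad\Longrightarrow\quad
\frac{w_{\rm mean}}{w_{\rm rms}}
= \frac{\sum_{i} w_i}{\sqrt{n_d \sum_{i} w_i^2}} \le 1 ,
```

with equality only when all the $w_i$ are equal.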

Methods of Fourier transformation of visibility data are reviewed in Chap. 10, 
and the results derived in Eqs. (6.61) and (6.62) can be applied to these by using the 
appropriate values of $w_{\rm mean}$ and $w_{\rm rms}$. Convolution of the visibility data in the (u, v)
plane to obtain values at points on a rectangular grid is a widely used process. In 
general, the data at adjacent grid points are then not independent, and a tapering of 
the signal and noise is introduced into the image. Aliasing can also cause the SNR 
to vary across the image. (These effects are explained in Fig. 10.5 and the associated 
discussion.) In such cases, the results derived here apply near the origin of the image, 
where the effects of tapering and aliasing are unimportant. The rms noise level over 
the image can be obtained by the application of Parseval’s theorem to the noise in
the visibility data (see Appendix 2.1). 

In practice, a number of factors that affect the SNR are difficult to determine 
precisely. For example, $T_S$ varies somewhat with antenna elevation. There are also a
number of effects that can reduce the response to a source without reducing the
noise, but these are important only for sources not near the (l, m) origin of an
image. These include the smearing resulting from the receiving bandwidth and from 
visibility averaging, discussed later in this chapter, and the effect of non-coplanar 
baselines, discussed in Sect. 11.7. 

Note also that in many instruments, two oppositely polarized signals (with 
crossed linear or opposite circular polarizations) are received and processed using 
separate IF amplifiers and correlators. For unpolarized sources, the overall SNR is 
then $\sqrt{2}$ greater than the values derived above, which include only one signal from
each antenna. 


6.2.4 Noise in Visibility Amplitude and Phase 


In synthesis imaging, we are usually concerned with data in the form of the real and 
imaginary parts of the complex visibility V, but sometimes it is necessary to work 
with the amplitude and phase. The sum of the visibility and noise is represented 
by $\mathbf{Z} = Z e^{j\phi}$, where we choose the real axis so that the phase $\phi$ is measured with
respect to the phase of $V$, as in Fig. 6.8. Then for $T_A \ll T_S$ (the antenna temperature
resulting from the source is much less than the system temperature), the probability
distributions of the resulting amplitude and phase are

$$p(Z) = \frac{Z}{\sigma^2}\exp\left(-\frac{Z^2 + |V|^2}{2\sigma^2}\right) I_0\left(\frac{Z|V|}{\sigma^2}\right), \qquad Z > 0\,, \tag{6.63a}$$

$$p(\phi) = \frac{1}{2\pi}\exp\left(-\frac{|V|^2}{2\sigma^2}\right)\left\{1 + \sqrt{\frac{\pi}{2}}\,\frac{|V|\cos\phi}{\sigma}\exp\left(\frac{|V|^2\cos^2\phi}{2\sigma^2}\right)\left[1 + {\rm erf}\left(\frac{|V|\cos\phi}{\sqrt{2}\,\sigma}\right)\right]\right\}, \tag{6.63b}$$

and erf is the error function (Abramowitz and Stegun 1968),

$${\rm erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt\,, \tag{6.63c}$$

where $I_0$ is the modified Bessel function of zero order and $\sigma$ is as given by Eq. (6.50).
The amplitude distribution is identical to that for a sine wave in noise, and the 
derivation is given by Rice (1944, 1954), Vinokur (1965), and Papoulis (1965), 
of which the last two also derive the result for the phase. $p(Z)$ is sometimes
referred to as the Rice distribution, and for $V = 0$, it reduces to the Rayleigh
distribution. Curves of $p(Z)$ and $p(\phi)$ are given in Fig. 6.9. Comparison of the curves
for $|V|/\sigma = 0$ and 1 indicates that the presence of a weak signal is more easily
detected by examining the visibility phase than by examining the amplitude.

Fig. 6.9 Probability distributions of (a) the amplitude, and (b) the phase, of the measured complex
visibility as functions of the SNR. |V| is the modulus of the signal component. Reprinted from
Moran (1976), © 1976, with permission from Elsevier.

Approximations for $p(Z)$ and $p(\phi)$ for the cases in which $|V|/\sigma \ll 1$ and
$|V|/\sigma \gg 1$ are given in Sect. 9.3. Expressions for the moments of $Z$ and $\phi$ and
their rms deviations are also given in that section. The rms phase deviation $\sigma_\phi$
is a particularly useful quantity, especially for astrometric and diagnostic work.
The expression for $\sigma_\phi$, valid for the case in which $|V|/\sigma \gg 1$, is $\sigma_\phi \simeq \sigma/|V|$
[Eq. (9.67)]. This result also follows intuitively from an examination of Fig. 6.8.
By substituting Eq. (6.50) into the expression for $\sigma_\phi$, setting $|V|$ equal to the flux
density $S$ of the source, which is appropriate if the source is unresolved, and using
Eq. (6.46) to relate the flux density and antenna temperature, we obtain

$$\sigma_\phi = \frac{T_S}{\eta_Q T_A \sqrt{2\Delta\nu_{\rm IF}\,\tau_a}}\,. \tag{6.64}$$

This equation is valid for the conditions $T_S/\sqrt{2\Delta\nu_{\rm IF}\,\tau_a} \ll T_A \ll T_S$, which are
the conditions most frequently encountered, and is useful for determining whether
the noise in the phase measurements of an interferometer is due exclusively to
receiver noise. Excess phase noise can be contributed by the atmosphere, by system
instabilities, and, in the case of VLBI, by the frequency standards.
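
The high-SNR approximation $\sigma_\phi \simeq \sigma/|V|$ can be checked by simulation. The sketch below draws noisy visibilities as in Fig. 6.8, with $V$ along the real axis and independent Gaussian noise of rms $\sigma$ in each component, and measures the rms of the resulting phase. The function name, sample count, and seed are illustrative assumptions.

```python
import math
import random


def rms_phase_deviation(v, sigma, n_samples=50_000, seed=2):
    """Monte Carlo estimate of the rms phase (radians) of Z = V + noise."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        re = v + rng.gauss(0.0, sigma)
        im = rng.gauss(0.0, sigma)
        phi = math.atan2(im, re)  # phase measured relative to the phase of V
        total += phi * phi
    return math.sqrt(total / n_samples)


v, sigma = 20.0, 1.0  # |V|/sigma = 20, well inside the high-SNR regime
print(rms_phase_deviation(v, sigma), sigma / v)
```

At low SNR the two numbers diverge, because the phase distribution of Eq. (6.63b) tends toward uniform over $2\pi$ and the small-angle argument no longer holds.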


6.2.5 Relative Sensitivities of Different Interferometer Systems 


Next we compare the sensitivity of several different interferometer systems, using 
as a measure of sensitivity the modulus of the signal divided by the rms noise, that
is, $V/\varepsilon$ in terms of the quantities at the correlator output in Fig. 6.8. Parameters
such as averaging times and IF bandwidths are the same for all cases considered. To
compare DSB and SSB cases, it is convenient to introduce a factor

$$\alpha = \frac{\text{double-sideband system temperature of double-sideband system}}{\text{system temperature of single-sideband system}}\,. \tag{6.65}$$


Recall that the system temperature of a receiver can be defined as the noise 
temperature of a thermal source at the input of a hypothetical noise-free (but 
otherwise identical) receiver that would produce the same noise level at the receiver 
output. [Equation (1.4) can be used for the equivalent noise temperature if the 
Rayleigh-Jeans approximation does not apply.] For a DSB receiver, the system 
temperature is described as DSB or SSB depending on whether the thermal noise 
source emits noise in both sidebands or only one. With these definitions, the SSB 
noise temperature is twice the DSB noise temperature. 

For an SSB system, the rms noise from one output of a correlator (either the real
or imaginary output in the case of a complex correlator) is $\sigma$ after averaging for a
time $\tau_a$, as given by Eq. (6.50). The corresponding noise power is $\sigma^2$. For a DSB
system, the rms output noise at a correlator output is $2\alpha\sigma$. In all cases, the signal
results from an unresolved source. For an SSB system, we take the signal voltage
from the correlator output to be $V$, as in Fig. 6.8. For a DSB system with the input
signal in one sideband only, the signal at the correlator output is $V$, and for a DSB
system with input in both sidebands, the correlator output is $2V$.

Values of the relative sensitivity for various systems are discussed below and
summarized in Table 6.1. Similar results are given by Rogers (1976).


Table 6.1 Relative signal-to-noise ratios for several types of systems

| System type | Relative SNR |
|---|---|
| 1. SSB with complex correlator | $1$ |
| 2. SSB with simple correlator | $1/\sqrt{2}$ |
| 3. SSB, simple correlator, fringe stopping, $\pi/2$ phase switching | $1/\sqrt{2}$ |
| 4. DSB, simple correlator,ᵃ fringe fitting, continuum signal | $1/(\sqrt{2}\alpha)$ |
| 5. DSB, simple correlator, fringe stopping, $\pi/2$ phase switching, continuum signal | $1/(\sqrt{2}\alpha)$ |
| 6a. DSB, fringe stopping, sideband-separation [Eqs. (6.30) to (6.33)], signal in one sideband only | $1/(2\alpha)$ |
| 6b. As (6a), but for continuum signal and visibilities in both sidebands combined | $1/(\sqrt{2}\alpha)$ |
| 7a. VLBI, DSB, complex correlator, one sideband removed by averaging of fast fringes | $1/(2\alpha)$ |
| 7b. As (7a), but for continuum signal, correlated separately for each sideband and results combined | $1/(\sqrt{2}\alpha)$ |
| 8. SSB, digital spectral correlator with simple correlator elements and correlation measured as a function of time offsets (see Sect. 8.8) | $1$ |

ᵃ For DSB with complex correlator, see text pertaining to Fig. 6.5

1. SSB system with complex correlator. The output signal is $V$, and the rms noise
from each correlator output is $\sigma$. As shown by Fig. 6.9 and Eq. (6.51), the ratio of
the signal amplitude to rms noise is $V/(\sqrt{2}\sigma)$. We shall take this as the standard
with respect to which the relative sensitivities of other systems are defined.

2. SSB system and simple correlator with fringe fitting. To measure both the real and
imaginary parts of the complex visibility, the fringes are not stopped but appear as
a sinusoid of amplitude $V$ at the fringe frequency $\nu_f$. The signal is accompanied
by noise of rms amplitude $\sigma$. The amplitude and phase are measured by “fringe
fitting,” that is, performing a least-mean-squares fit of a sinusoid to the correlator
output. This procedure involves multiplying the correlator output waveform
by $\cos(2\pi\nu_f t)$ and $\sin(2\pi\nu_f t)$ and integrating over the period $\tau_a$. The results
represent the real and imaginary parts, respectively, of the cross-correlation.
We calculate the effects of fringe fitting on the signal and noise separately and
assume, with no loss of generality, that the fringes are in phase with the cosine
component in the fringe fitting, in which case the sine component of the signal
is zero. The correlator output has a bandwidth $\Delta\nu_c$, which is sufficient to pass
the fringe-frequency waveform, and it is sampled at time intervals $\tau_s = 1/(2\Delta\nu_c)$
and digitized. Within the period $\tau_a$, there are $N = 2\Delta\nu_c\tau_a$ samples. Thus, for the
cosine component of the signal, the amplitude is

$$\frac{1}{N}\sum_{i=1}^{N} V\cos^2(2\pi i\nu_f\tau_s) = \frac{V}{2} + \frac{V}{2N}\sum_{i=1}^{N}\cos(4\pi i\nu_f\tau_s)\,. \tag{6.66}$$


The second term on the right side represents the end effects and is approximately
zero if there are an integral number of half-cycles of the fringe frequency within
the period $\tau_a$. It also becomes relatively small as $\nu_f\tau_a$ increases, and we assume
here that there are enough fringe cycles (say, ten or more) within time $\tau_a$ that end
effects can be neglected. To determine the effect of fringe fitting on the noise,
we represent the sampled noise by $n(i\tau_s)$, multiply by the cosine function, and
determine the variance (mean squared value). Averaged over time $\tau_a$, the result is

$$\left[\frac{1}{N}\sum_{i=1}^{N} n(i\tau_s)\cos(2\pi i\nu_f\tau_s)\right]^2 = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{k=1}^{N} n(i\tau_s)\cos(2\pi i\nu_f\tau_s)\,n(k\tau_s)\cos(2\pi k\nu_f\tau_s)\,. \tag{6.67}$$

We need to determine the expectation value of this expression, denoted by angle
brackets. Only terms for which $i = k$ contribute to the expectation. Thus, the
noise variance becomes

$$\left\langle \frac{1}{N^2}\sum_{i=1}^{N} n^2(i\tau_s)\cos^2(2\pi i\nu_f\tau_s) \right\rangle = \frac{\sigma^2}{2}\,. \tag{6.68}$$

This result shows that half of the noise power, $\sigma^2$, that is available at the
correlator output appears in the cosine component of the fringe fitting. Similarly,
the other half appears in the sine component. The combined rms noise of the
two components is $\sigma$, and the SNR after fringe fitting is $V/(2\sigma)$. The relative
sensitivity is $1/\sqrt{2}$.

3. SSB system with simple correlator and $\pi/2$ phase switching of LO. In this case,
the fringes have been stopped, and to determine the complex visibility, a phase
change of $\pi/2$ is periodically inserted into one oscillator [e.g., $\theta_n \rightarrow \theta_n + \pi/2$
in Eq. (6.11) or (6.15)] so that the correlator is effectively time-shared between
the real and imaginary parts of the cross-correlation function, which are averaged
separately. The visibility phase can thereby be determined. The signal in the two
phase conditions is $V\cos(\phi_v)$ and $V\sin(\phi_v)$, and the rms noise associated with
each of these terms is $\sqrt{2}\sigma$ (the $\sqrt{2}$ factor enters because the noise in each output
is averaged over time $\tau_a/2$ only). Thus, the modulus of the signal is $V$ and the
rms noise from the two components is $2\sigma$. The SNR is $V/(2\sigma)$ and the relative
sensitivity is $1/\sqrt{2}$.

4. DSB system with simple correlator and fringe fitting. We consider the case of a 
continuum source with signal in both sidebands and assume that the instrumental 
delay is adjusted so that the signal appears entirely in the (real) output of a 
simple correlator, as a fringe-frequency cosine wave of amplitude V. In terms 
of Eq. (6.18), the factor cos(27 vp ATa + Gg) is unity. Then for the DSB system, 
the signal amplitude is $2V$, and the rms noise is $2\alpha\sigma$. The fringe-fitting procedure
follows that of case 2, but in this case, the signal amplitude is greater by a factor
of two and is equal to $V$. The rms noise is greater by a factor of $2\alpha$. Thus, the
SNR is $V/(2\alpha\sigma)$, and the relative sensitivity is $1/(\sqrt{2}\alpha)$.

5. DSB system with simple correlator and $\pi/2$ phase switching of LO. Here, the
fringes have been stopped, and to determine the visibility phase, it is necessary
to perform $\pi/2$ phase switching as in case 3 above. (For a DSB system, the phase
switching must be on the first LO.) The amplitude of the signal is $2V$ because the
system is DSB, and the rms noise level from the correlator output is increased to
$2\sqrt{2}\alpha\sigma$ because the averaging time for each component is reduced to $\tau_a/2$ by the
time sharing of the correlator between the two phase conditions. This rms level
is associated with both the cosine and sine components of the signal, so the SNR
is $V/(2\alpha\sigma)$. The relative sensitivity is $1/(\sqrt{2}\alpha)$.

6. One sideband of a DSB system with $\pi/2$ phase switching of the LO and
sideband separation after correlation. A complex correlator is used, and the
procedure corresponding to Eqs. (6.30) to (6.33) is followed. We consider the
upper sideband and ignore lower-sideband signal terms. The components $r_1$, $r_2$,
$r_3$, $r_4$ have amplitudes $V$ multiplied by the cosine or sine of $\psi_u$. Thus, from
Eqs. (6.30) and (6.31), ignoring lower-sideband terms, the right side of Eq. (6.32)
becomes $\frac{1}{2}(2V\cos\psi_u + j2V\sin\psi_u)$, the modulus of which is $V$. The rms noise
associated with each term $r_1$, $r_2$, $r_3$, and $r_4$ is $2\sqrt{2}\alpha\sigma$ since the system is DSB,
and because of the LO switching, the effective averaging time is $\tau_a/2$. Thus, the
rms noise associated with the right side of Eq. (6.32) is $2\sqrt{2}\alpha\sigma$, as in case 5.
The SNR is $V/(2\sqrt{2}\alpha\sigma)$, and the relative sensitivity is $1/(2\alpha)$. This applies to a
signal in one sideband such as a spectral line. For a continuum source, the
cross-correlation can be measured for each of the two sidebands, and if the results are
then averaged, the relative sensitivity becomes $1/(\sqrt{2}\alpha)$. The terms $r_2$ and $r_4$ are
eliminated in averaging the right sides of Eqs. (6.32) and (6.33), and the result
is the same as for a simple correlator with LO phase switching described under
case 5 above.

7. VLBI observations with a DSB system and complex correlator. In VLBI
observations, a DSB system is sometimes used, and fringe rotation is inserted after
playback of the recorded signal, as mentioned in Sect. 6.1. For one sideband,
the fringes are stopped, but for the other, they are lost in the averaging at the
correlator output because the fringe frequencies are high. Thus, for one playback,
we have the signal of an SSB system and the noise of a DSB system in each of
the real and imaginary outputs, that is, an SNR of $V/(2\sqrt{2}\alpha\sigma)$ and a relative
sensitivity of $1/(2\alpha)$ for each individual sideband.

8. Measurement of cross-correlation as a function of time offset. Digital spectral 
correlators that measure cross-correlation as a function of time delay are 
described in Sect. 8.8. In a lag-type correlator, the cross-correlation is measured 
as a function of time offset, implemented by introducing instrumental delays. 
The Fourier transform of the cross-correlation as a function of relative time delay 
between the signals is the cross-correlation as a function of frequency, as required 
in spectral line measurements. As mentioned in Sect. 6.1.7, it is necessary to 
use only simple correlators for this measurement. The range of time offsets 
of the two signals covers both positive and negative values, and the resulting 
measurements of cross-correlation contain both even and odd components. 
Fourier transformation then provides both the real and imaginary components of 
the cross-correlation as a function of frequency. The full sensitivity is obtained so 
long as the range of time offsets is comparable to the reciprocal signal bandwidth 
or greater; see Table 9.7. Note that in Table 6.1, we have not included the 
quantization loss discussed in Sect. 8.3.3. A demonstration of the sensitivity 
using a simple correlator when the measurements are made as a function of time 
delay is given by Mickelson and Swenson (1991). 
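
The fringe-fitting result of case 2 can be reproduced with a short calculation. The sketch below evaluates the sums in Eqs. (6.66) and (6.68) directly, under the assumption (ours, for illustration) that the sampled noise is white, so that each of the $N$ samples has variance $N\sigma^2$ and a plain average over $\tau_a$ has variance $\sigma^2$. The function names and the sample and cycle counts are hypothetical.

```python
import math


def fringe_fit_cosine(v, sigma, n_samples, n_cycles):
    """Return (signal amplitude, noise variance) of the cosine component.

    n_cycles is the number of fringe cycles within the averaging period,
    so that nu_f * tau_s = n_cycles / n_samples.
    """
    step = 2.0 * math.pi * n_cycles / n_samples   # 2*pi*nu_f*tau_s
    cos_sq = [math.cos(i * step) ** 2 for i in range(1, n_samples + 1)]
    signal = v * sum(cos_sq) / n_samples          # left side of Eq. (6.66)
    per_sample_variance = n_samples * sigma**2    # white-noise assumption
    noise_variance = per_sample_variance * sum(cos_sq) / n_samples**2  # Eq. (6.68)
    return signal, noise_variance


def relative_sensitivity(v, sigma, n_samples, n_cycles):
    """SNR after fringe fitting divided by the case 1 standard V/(sqrt(2)*sigma)."""
    signal, noise_variance = fringe_fit_cosine(v, sigma, n_samples, n_cycles)
    combined_rms = math.sqrt(2.0 * noise_variance)  # cosine and sine components together
    return (signal / combined_rms) / (v / (math.sqrt(2.0) * sigma))


print(fringe_fit_cosine(1.0, 1.0, 1000, 10))
print(relative_sensitivity(1.0, 1.0, 1000, 10))
```

With an integral number of half-cycles in the averaging period the end-effect term vanishes, and the signal and noise variance reduce to $V/2$ and $\sigma^2/2$.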


Of the cases included in Table 6.1, the SSB with complex correlator is the 
one generally used where possible, because of the sensitivity and avoidance of 
the complications of DSB operation. Cases 2 and 3 in the table are included 
mainly for completeness of the discussion. As mentioned earlier, for frequencies 
of several hundred gigahertz, the most sensitive type of receiver input stage may 
be an SIS mixer. This has an inherently double-sided response, and if necessary, 
a sideband can be removed by filtering or using a sideband-separating arrangement 
(Appendix 7.1). For DSB operation, the most important cases in Table 6.1 are 6a and 
6b. The case in which the unwanted sideband is only partially rejected is discussed 
in Appendix 6.1. 
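
For quick comparisons, the relative sensitivities derived in cases 1 to 8 can be collected in a small lookup. The sketch below is illustrative only: the function name and case labels are ours, and the entry for case 7b is not stated explicitly in the discussion above but is inferred by analogy with case 6b (two sidebands with independent noise combined).

```python
import math


def relative_snr(case, alpha=1.0):
    """Relative SNR for the system types of Table 6.1, given alpha of Eq. (6.65)."""
    sqrt2 = math.sqrt(2.0)
    table = {
        "1": 1.0,                     # SSB, complex correlator (the standard)
        "2": 1.0 / sqrt2,             # SSB, simple correlator, fringe fitting
        "3": 1.0 / sqrt2,             # SSB, simple correlator, pi/2 switching
        "4": 1.0 / (sqrt2 * alpha),   # DSB, simple correlator, fringe fitting
        "5": 1.0 / (sqrt2 * alpha),   # DSB, simple correlator, pi/2 switching
        "6a": 1.0 / (2.0 * alpha),    # DSB, sideband separation, one sideband
        "6b": 1.0 / (sqrt2 * alpha),  # as 6a, both sidebands combined
        "7a": 1.0 / (2.0 * alpha),    # VLBI DSB, one sideband
        "7b": 1.0 / (sqrt2 * alpha),  # inferred by analogy with 6b (assumption)
        "8": 1.0,                     # SSB lag correlator
    }
    return table[case]


for alpha in (0.5, 1.0):
    print(alpha, {case: round(relative_snr(case, alpha), 3) for case in ("1", "6a", "6b")})
```

Note that for $\alpha$ near 1/2 a DSB system observing a continuum source (case 6b) can exceed the SSB standard, whereas for a signal in one sideband (case 6a) it only equals it.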


6.2.6 System Temperature Parameter $\alpha$


As already noted, DSB systems are mainly used at millimeter and submillimeter 
wavelengths, at which the receiver input stage is commonly a cooled SIS mixer. 
Such a system can be converted to SSB operation by filtering out the unwanted 
sideband and terminating the corresponding input in a cold load. If the atmospheric 
losses are high and the receiver temperature is low, most of the system noise 
will come from the antenna, and terminating one sideband in a cold load will 
approximately halve the level of noise within the receiver. The system temperature 
of the SSB system will then be approximately equal to the double-sideband system
temperature of the DSB system, and the value of $\alpha$ [defined in Eq. (6.65)] tends
toward 1. On the other hand, if atmospheric and antenna losses are low and most of
the system noise comes from the mixer and IF stages, then terminating one sideband
input in a cold load rather than the cold sky makes little difference to the noise level
in the receiver. The system temperature of the SSB system will be close to the
single-sideband system temperature of the DSB system, which is twice the DSB value. The
value of $\alpha$ then tends toward 1/2. To recapitulate: If the atmospheric noise dominates
the receiver noise, then $\alpha$ tends toward 1, but if the receiver noise dominates, then $\alpha$
tends toward 1/2. Note, however, that $\alpha$ is not confined to the range $1/2 < \alpha < 1$.
For example, if noise from the antenna is low but the termination of the image
sideband in the SSB system is uncooled and injects a high noise level, then $\alpha$ can
be $< 1/2$. If the front end is tuned close to an atmospheric absorption line in such
a way that the additional sideband of the DSB system falls in a frequency range of
enhanced atmospheric noise, then $\alpha$ can be $> 1$.
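
These limiting cases can be illustrated with a toy noise budget. The model below is our own simplification, not a formula from the text: it assumes equal gain in the two sidebands, a sky (antenna) noise temperature for each sideband, a receiver noise temperature expressed as a DSB value, and, for the SSB system, an image-sideband termination at temperature `t_load`. All names and numbers are hypothetical.

```python
def alpha_parameter(t_sky_signal, t_sky_image, t_rx_dsb, t_load):
    """Toy estimate of alpha = (DSB temperature of DSB system) / (SSB system temperature).

    Assumes equal sideband gains. The DSB system sees the sky in both
    sidebands; the SSB system sees the sky in the signal sideband and a
    load at t_load in the image sideband.
    """
    t_dsb_system = 0.5 * (t_sky_signal + t_sky_image) + t_rx_dsb
    t_ssb_system = t_sky_signal + t_load + 2.0 * t_rx_dsb
    return t_dsb_system / t_ssb_system


print(alpha_parameter(200.0, 200.0, 1.0, 0.0))    # sky noise dominates: near 1
print(alpha_parameter(1.0, 1.0, 200.0, 0.0))      # receiver noise dominates: near 1/2
print(alpha_parameter(10.0, 10.0, 50.0, 300.0))   # warm image termination: below 1/2
print(alpha_parameter(100.0, 300.0, 5.0, 0.0))    # noisy image sideband: above 1
```

Within this simplified picture the four calls reproduce the four regimes described above, which is a consistency check on the reasoning rather than a model of any particular receiver.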


6.3 Effect of Bandwidth 


As seen in the preceding section, the sensitivity of a receiving system to a broadband 
cosmic signal increases with the system bandwidth. Here we are concerned with the 
effect of bandwidth on the angular range over which fringes are detected, and on 
the fringe amplitude. These effects result from the variation of fringe frequency, in 
cycles per radian on the sky, with the received radio frequency. If the monochromatic 
response is integrated over the bandwidth, the fringes are reinforced for directions 
close to that for which the time delays from the source to the correlator inputs are 
equal, but for other directions, the fringes vary in phase across the bandwidth. This 
effect, when measured in a plane containing the interferometer baseline, causes the 
fringe amplitude to decrease with angle in a manner similar to that caused by the 
antenna beams (Swenson and Mathur 1969) and is sometimes referred to as the 
delay beam. It can be used to confine the response of an interferometer to a limited 
area of the sky and thereby reduce the possibility of source confusion, which can 
occur when the fringe patterns of two or more sources are recorded simultaneously. 
Examples of such usage can be found in some early interferometers built for 
operation at frequencies below 100 MHz (Goldstein 1959; Douglas et al. 1973). 


6.3.1 Imaging in the Continuum Mode 


The effect of bandwidth on the fringe amplitude was discussed in Sect. 2.2. 
Equation (2.3) gives an expression for the fringes observed for a point source with 
an east-west baseline of length $D$ and a rectangular signal passband of width $\Delta\nu$.
The fringe amplitude is proportional to a factor

$$R_b = \frac{\sin(\pi D l \Delta\nu/c)}{\pi D l \Delta\nu/c}\,. \tag{6.69}$$

Consider an array for which $D$ is typical of the longest baselines. The synthesized
beamwidth of the array, $\theta_b$, is approximately equal to $\lambda_0/D = c/\nu_0 D$, where $\nu_0$
is the observing frequency and $\lambda_0$ the corresponding wavelength. (Note that in this
section, $\nu_0$ is the center frequency of the RF input band, not an IF band.) Thus,
Eq. (6.69) becomes

$$R_b = \frac{\sin(\pi\Delta\nu\,l/\nu_0\theta_b)}{\pi\Delta\nu\,l/\nu_0\theta_b}\,. \tag{6.70}$$

The parameter $\Delta\nu\,l/\nu_0\theta_b$ is equal to the fractional bandwidth multiplied by the
angular distance of the source from the (l, m) origin measured in beamwidths. If this
parameter is equal to unity, $R_b = 0$ and the measured visibility is reduced to zero. To
keep $R_b$ close to unity, we require $\Delta\nu\,l/\nu_0\theta_b \ll 1$. Thus, to avoid underestimation
of the visibility at long baselines, there is a limit on the angular size of the image 
that is inversely proportional to the fractional bandwidth. 
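
The attenuation factor of Eq. (6.70) is easily tabulated. The sketch below evaluates $R_b$ as a function of the fractional bandwidth $\Delta\nu/\nu_0$ and the source offset $l/\theta_b$ in beamwidths; the function name and the sample values are illustrative assumptions.

```python
import math


def bandwidth_attenuation(fractional_bandwidth, offset_beams):
    """Eq. (6.70): R_b for a rectangular passband.

    fractional_bandwidth is (delta nu)/nu_0 and offset_beams is l/theta_b.
    """
    x = math.pi * fractional_bandwidth * offset_beams
    if x == 0.0:
        return 1.0  # limit of sin(x)/x as x -> 0
    return math.sin(x) / x


# A 1% fractional bandwidth at increasing distances from the (l, m) origin.
for offset in (0.0, 10.0, 50.0, 100.0):
    print(offset, bandwidth_attenuation(0.01, offset))
```

For a 1% bandwidth the response falls to zero for a source 100 beamwidths from the origin, in line with the condition that the parameter $\Delta\nu\,l/\nu_0\theta_b$ equal unity.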

We now examine the same effect in more detail by considering the distortion in 
the synthesized image. First, recall that the response of an array can be written as 


$$V(u, v)\,W(u, v) \longleftrightarrow I(l, m) \ast\ast\, b_0(l, m)\,, \tag{6.71}$$

where $\longleftrightarrow$ represents Fourier transformation, and the double asterisk represents
convolution in two dimensions. The fringe visibility is multiplied by $W(u, v)$, the
spatial sensitivity function of the array for a particular observation. The Fourier
transform of the left side of Eq. (6.71) gives the intensity distribution $I(l, m)$
convolved with the synthesized beam function $b_0(l, m)$. For simplicity, we have
omitted the primary antenna beam and minor effects related to use of the discrete 
Fourier transform. The synthesized beam is defined here as the Fourier transform of 
W(u, v). 

In operation in the continuum mode, the visibility data measured with bandwidth 
Av are treated as though they were measured with a single-frequency receiving 
system tuned to the center frequency, vo. Thus, for all frequencies within the 
bandwidth, the assigned values of u and v are those appropriate to frequency vo. 
At another frequency $\nu$ within the passband, the true spatial frequency coordinates
$u_\nu$ and $v_\nu$ are related to the assigned values $u$ and $v$ by

$$(u, v) = \left(\frac{u_\nu\nu_0}{\nu}, \frac{v_\nu\nu_0}{\nu}\right). \tag{6.72}$$

The contribution to the measured visibility from a narrow band of frequencies
centered on $\nu$ is

$$V(u, v) = V\left(\frac{u_\nu\nu_0}{\nu}, \frac{v_\nu\nu_0}{\nu}\right) \longleftrightarrow \left(\frac{\nu}{\nu_0}\right)^2 I\left(\frac{l\nu}{\nu_0}, \frac{m\nu}{\nu_0}\right), \tag{6.73}$$

where we have used the similarity theorem of Fourier transforms.⁶ Thus, the
contribution to the measured intensity is the true intensity distribution scaled in
(l, m) by a factor $\nu/\nu_0$ and in intensity by $(\nu/\nu_0)^2$. The derived intensity distribution
is convolved with $b_0(l, m)$, the synthesized beam corresponding to frequency $\nu_0$. We
have assumed that the beam does not vary significantly with frequency and have
used the same spatial sensitivity function $W(u, v)$ to represent the whole frequency
passband. The overall response is obtained by integrating over the passband with
appropriate weighting and is


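$$I_b(l, m) = \left[\frac{\displaystyle\int_0^\infty \left(\frac{\nu}{\nu_0}\right)^2 |H_{\rm RF}(\nu)|^2\, I\left(\frac{l\nu}{\nu_0}, \frac{m\nu}{\nu_0}\right) d\nu}{\displaystyle\int_0^\infty |H_{\rm RF}(\nu)|^2\, d\nu}\right] \ast\ast\, b_0(l, m)\,. \tag{6.74}$$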

Note that the integrals must be taken over the whole radio-frequency passband, 
denoted by the subscript RF, which includes both sidebands in the case of a DSB 
system. We assume that the passband function Hrr(v) is identical for all antennas. 
The values of $l$ and $m$ in the intensity function in Eq. (6.74) are multiplied by the
factor $\nu/\nu_0$, which varies as we integrate over the passband, being equal to unity
at the band center. Thus, one can envisage the integrals in the square brackets in
Eq. (6.74) as resulting in a process of averaging a large number of images, each
with a different scale factor. The scale factors are equal to $\nu/\nu_0$, and the range of
values of $\nu$ is determined by the observing passband. The images are aligned at the
origin, and thus the effect of the integration over frequency is to produce a radial
smearing of the intensity distribution before it is convolved with the beam. The
response to a point source at position (l, m) is radially elongated by an amount equal
to $\sqrt{l^2 + m^2}\,\Delta\nu/\nu_0$. For distances from the origin at which the elongation is large
compared with the synthesized beamwidth, features on the sky become attenuated 
by the smearing, so there is an effective limitation of the useful field of view. The 
measured intensity is the smeared distribution convolved with the synthesized beam. 

Details of the behavior of the derived intensity distribution can be deduced from 
Eq. (6.74). For example, suppose that the beam contains a circularly symmetrical 
sidelobe at a large angular distance from the beam axis and that in an image, the 
response to a distant source causes the sidelobe to fall near the origin. Is the sidelobe 
broadened near the origin? Since the distant source is elongated, the sidelobe will be 
smeared in a direction parallel to that of a line joining the source and the origin, as 
shown in Fig. 6.10. It will be broadened near the origin but not at a point 90° around 
the sidelobe as measured from the source. 

⁵ Similarity theorem: if $f(x)$ has a Fourier transform $F(s)$, then $f(ax)$ has a Fourier 
transform $|a|^{-1}F(s/a)$ (Bracewell 2000). 

Fig. 6.10 Radial smearing resulting from the bandwidth effect for a point source at 
$(l_1, m_1)$. The effects on the responses of the main beam and a ringlobe (i.e., a sidelobe 
of the form in Fig. 5.15) are shown. [Figure labels: source; main-beam response; 
ringlobe response.] 

To estimate the magnitude of the suppression of distant sources, it is useful to 
calculate $R_b$, the peak response to a point source at a distance $r_1$ from the origin 
of the $(l, m)$ plane, as a fraction of the response to the same source at the origin. 
Because the effect we are considering is a radial smearing, we need consider only 
the intensity along a radial line through the $(l, m)$ origin, as shown in Fig. 6.11a. We 
use idealized parameters; the passband is represented by a rectangular function of 
width $\Delta\nu$ and the synthesized beam by a circularly symmetrical Gaussian function 
of standard deviation $\sigma_b = \theta_b/\sqrt{8 \ln 2}$, where $\theta_b$ is the half-power beamwidth. 
For simplicity, the factor $(\nu/\nu_0)^2$ in the integral in the numerator of Eq. (6.74) is 
omitted. The convolution becomes a one-dimensional (radial) process, as shown in 
Fig. 6.11b. The radially elongated source is represented by a rectangular function 
from $r_1(1 - \Delta\nu/2\nu_0)$ to $r_1(1 + \Delta\nu/2\nu_0)$, normalized to unit area. The beam is 
represented by the function $e^{-r^2/2\sigma_b^2}$, which is normalized to unity on the beam axis. 
When the beam is centered on the source, as shown in Fig. 6.11, $R_b$ is given by 


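$$R_b = \frac{\nu_0}{r_1 \Delta\nu} \int_{r_1(1 - \Delta\nu/2\nu_0)}^{r_1(1 + \Delta\nu/2\nu_0)} e^{-(r - r_1)^2/2\sigma_b^2} \, dr$$

$$\quad = \sqrt{2\pi}\, \frac{\sigma_b \nu_0}{r_1 \Delta\nu}\, \mathrm{erf}\left( \frac{r_1 \Delta\nu}{2\sqrt{2}\, \sigma_b \nu_0} \right)$$

$$\quad = 1.0645\, \frac{\theta_b \nu_0}{r_1 \Delta\nu}\, \mathrm{erf}\left( 0.8326\, \frac{r_1 \Delta\nu}{\theta_b \nu_0} \right). \qquad (6.75)$$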

Fig. 6.11 Response of an array with a broadband receiving system to a point source at distance 
$r_1$ from the origin of the $(l, m)$ plane. (a) The point source (delta function) at $r_1$ becomes radially 
broadened into a rectangular function of unit area indicated by the heavy line. (b) Cross section of 
the intensity distribution in the $r$ direction. The synthesized beam is represented by the Gaussian 
function. The peak intensity of the response to the source is proportional to the shaded area. 


A curve of $R_b$ as a function of the parameter $r_1 \Delta\nu/\theta_b \nu_0$, which is the distance of 
the source from the origin measured in beamwidths, multiplied by the fractional 
bandwidth, is shown in Fig. 6.12. Values of 0.2 and 0.5 for this parameter reduce the 
response by 0.9% and 5.5%, respectively. 

If the receiving passband is represented by a Gaussian function of equivalent 
width $\Delta\nu$ (i.e., standard deviation $= \Delta\nu/2.5066$), the reduction factor becomes 

$$R_b = \frac{1}{\sqrt{1 + (0.939\, r_1 \Delta\nu/\theta_b \nu_0)^2}}. \qquad (6.76)$$


A curve of this function is also included in Fig. 6.12. Comparison of the two curves 
illustrates the dependence on the passband shape. 

Fig. 6.12 Relative amplitude of the peak response to a point source as a function of the distance 
from the field center and either the fractional bandwidth or the averaging time. 
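
The two reduction factors of Eqs. (6.75) and (6.76) are easy to evaluate numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and the argument `x` stands for the parameter $r_1 \Delta\nu/\theta_b \nu_0$ (source offset in beamwidths multiplied by the fractional bandwidth).

```python
import math


def rb_rectangular(x):
    """Eq. (6.75): peak-response factor for a rectangular passband.

    x is r1 * dnu / (theta_b * nu0), i.e. the source offset in
    beamwidths multiplied by the fractional bandwidth.
    """
    if x == 0.0:
        return 1.0  # limiting value of the expression as x -> 0
    return 1.0645 / x * math.erf(0.8326 * x)


def rb_gaussian(x):
    """Eq. (6.76): peak-response factor for a Gaussian passband
    of equivalent width dnu."""
    return 1.0 / math.sqrt(1.0 + (0.939 * x) ** 2)


if __name__ == "__main__":
    for x in (0.2, 0.5, 1.0, 2.0):
        print(f"x = {x:3.1f}  rectangular: {rb_rectangular(x):.4f}  "
              f"Gaussian: {rb_gaussian(x):.4f}")
```

For `x = 0.2` and `x = 0.5`, the rectangular-passband function reproduces the reductions of about 0.9% and 5.5% quoted above.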


6.3.2 Wide-Field Imaging with a Multichannel System 


Broadband images can also be obtained by observing with a multichannel system 
(i.e., a spectral line system as described in Sect. 8.8.2). In this case, the passband is 
divided into a number of channels by using either a bank of narrowband filters or 
a multichannel digital correlator. The visibility is measured independently for each 
channel, so the values of u and v can be scaled correctly and an independent image 
obtained for each channel. This scaling causes the spatial sensitivity function to vary 
over the band, and at frequency $\nu$, the synthesized beam is $(\nu/\nu_0)^2 b_0(l\nu/\nu_0, m\nu/\nu_0)$, 
where $b_0(l, m)$ is the monochromatic beam at frequency $\nu_0$. The images can be 
combined by summation, and if given equal weights, the result for $N$ channels is 
represented by 

$$I(l, m) \ast\ast \left[ \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\nu_i}{\nu_0} \right)^2 b_0\left( \frac{l\nu_i}{\nu_0},\ \frac{m\nu_i}{\nu_0} \right) \right]. \qquad (6.77)$$

In this case, there is no smearing of the intensity distribution, but the beam suffers a 
radial smearing that has the desirable effect of reducing distant sidelobes. Therefore, 
this mode of observation is well suited for imaging wide fields. The improvement 
in the beam results from the increase in the number of (u, v) points measured, an 
effect that is also used in multifrequency synthesis discussed in Sect. 11.6. 


6.4 Effect of Visibility Averaging 
6.4.1 Visibility Averaging Time 


In most synthesis arrays, the output of each correlator is averaged for consecutive 
time periods, $\tau_a$, and thus consists of real or complex values spaced at intervals 
$\tau_a$ in time. It is advantageous to make $\tau_a$ long enough to keep the data rate 
from the correlator readout conveniently small. An upper limit on $\tau_a$ results from 
a consideration of the sampling theorem discussed in Sect. 5.2.1 and is briefly 
explained as follows. In discrete Fourier transformation of the visibility to intensity, 
the data points are often spaced at intervals $\Delta u$ and $\Delta v$, as shown in Fig. 5.3. If the 
size of the field to be imaged is $\theta_f$ in the $l$ and $m$ directions, then $\Delta u = \Delta v = 1/\theta_f$. 
In time $\tau_a$, the motion of a baseline vector within the $(u, v)$ plane should not be 
allowed to exceed $\Delta u$; otherwise, the visibility values will not fully represent the 
angular variation of the brightness function. 

Consider the case in which the longest baseline is east-west in orientation and 
the source under observation is at a high declination, which results in the fastest 
motion of the baseline vector. If the baseline length is $D_\lambda$ wavelengths, the vector in 
the $(u, v)$ plane traces out an approximately circular locus, the tip of which moves 
at a speed of $\omega_e D_\lambda$ wavelengths per unit time, where $\omega_e$ is the angular velocity 
of rotation of the Earth. Thus, we require that $\tau_a \omega_e D_\lambda < 1/\theta_f$, which results, in 
practice, in $\tau_a \approx C/(\omega_e D_\lambda \theta_f)$, where $C$ is a factor likely to be in the range 
0.1-0.5. Note that $D_\lambda \theta_f$ is approximately the number of synthesized beamwidths across 
the field, and thus $\tau_a$ must be somewhat smaller than the time taken for the Earth 
to rotate through one radian, divided by this number. Although shorter baselines 
could be averaged for longer times, in most synthesis arrays, all correlator outputs 
are read at the same time, at a rate appropriate for the longest baselines. Another 
consideration is that sporadic interference can be edited out of the data with minimal 
information loss if $\tau_a$ is not too long. For large arrays, $\tau_a$ is generally in the range of 
tens of milliseconds to tens of seconds. Determining the visibility at the $(\Delta u, \Delta v)$ 
grid points from the sampled data on the (u, v) loci is discussed in Sect. 10.2.3. 
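
As an illustration of the limit on the averaging time derived above, the sketch below evaluates $C/(\omega_e D_\lambda \theta_f)$. It is not from the original text: the function name, the default value of the factor `c` (chosen arbitrarily within the quoted 0.1-0.5 range), and the example array parameters are all assumptions made for illustration.

```python
import math

OMEGA_E = 7.292e-5  # angular velocity of the Earth's rotation, rad/s


def max_averaging_time(d_lambda, theta_f, c=0.25):
    """Approximate averaging time C / (omega_e * D_lambda * theta_f), in seconds.

    d_lambda : longest baseline, in wavelengths
    theta_f  : width of the field to be imaged, in radians
    c        : factor in the range 0.1-0.5 (0.25 is an arbitrary choice)
    """
    return c / (OMEGA_E * d_lambda * theta_f)


if __name__ == "__main__":
    # Hypothetical example: 36 km baseline at 6 cm wavelength, 10 arcmin field.
    d_lambda = 36e3 / 0.06
    theta_f = math.radians(10.0 / 60.0)
    print(f"tau_a ~ {max_averaging_time(d_lambda, theta_f):.2f} s")
```

Because the limit scales inversely with both the baseline length and the field size, doubling either one halves the allowable averaging time.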


6.4.2 Effect of Time Averaging 


We now examine in more detail the effect of the averaging on the synthesized 
intensity distribution. In reducing the data, all visibility values within each interval 
$\tau_a$ are treated as though they applied to the time at the center of the averaging 
period. Thus, for example, the measurements at the beginning of each averaging 
period enter into the visibility data with assigned values of $u$ and $v$ that apply to 
times $\tau_a/2$ later than the true values. In effect, the resulting image consists of the 
average of a large number of images, each with a different timing offset distributed 
progressively throughout the range $-\tau_a/2$ to $\tau_a/2$. These timing offsets apply only 
to the assignment of $(u, v)$ values and do not resemble a clock error that would affect 
the whole receiving system. 

Consider an unresolved source, represented by a delta function. To simplify the 
situation, we consider observations with east-west baselines and examine the effects 
in the $(u', v')$ plane and the corresponding $(l', m')$ sky plane (see Sect. 4.2). The 
spacing loci are circular arcs generated by vectors rotating at angular velocity $\omega_e$, as 
shown in Fig. 6.13a. Consider first the case of an east-west linear array; then, of the 
antenna spacing components $(X, Y, Z)$ defined in Fig. 4.1, only $Y$ is nonzero. The 
circular arcs of the spacing loci are centered on the $(u', v')$ origin as in Fig. 6.13b, 
and a timing offset $\delta t$ is equivalent to a rotation of the $(u', v')$ axes through an 
angle $\omega_e \delta t$. The visibility of the source is the combination of two sets of sinusoidal 
corrugations, one real and one imaginary: 

$${}^2\delta(l_1', m_1') \longleftrightarrow \cos 2\pi(u' l_1' + v' m_1') - j \sin 2\pi(u' l_1' + v' m_1'). \qquad (6.78)$$

The angle of the corrugations is related to the position angle $\psi' = \tan^{-1}(m_1'/l_1')$ 
of the point source, as shown in Fig. 6.14. A change in $\psi'$ causes an equivalent 
rotation of the corrugations and vice versa. For an east-west array, time offsets 
therefore correspond to proportional rotations of the intensity in the $(l', m')$ plane. It 


Fig. 6.13 Spacing loci in the $(u', v')$ plane, (a) for the general case and (b) for an east-west 
baseline. The angle $\omega_e \tau_a$ over which the averaging takes place is enlarged for clarity: for example, 
with an averaging time of 30 s, the angle would be 7.5 arcmin. [Figure panels (a) and (b); vertical 
axes labeled $v'$.] 

Fig. 6.14 (a) Point source at $(l_1', m_1')$ and (b) the real part of the corresponding visibility function. 
The ridges of the sinusoidal corrugations that represent the visibility in the $(u', v')$ plane are 
orthogonal to the radius vector $r_1'$ at the position of the source in the $(l', m')$ plane. 


follows that the effect of the time averaging is to produce a circumferential smearing 
similar to that resulting from the receiving bandwidth but orthogonal to it. If we 
express positions in the $(l', m')$ plane in terms of the radial coordinates $(r', \psi')$ 
shown in Fig. 6.14a, the image obtained from the averaged data can be expressed 
in terms of the sky brightness $I(r', \psi')$ by 

$$I_a(r', \psi') = \left[ \frac{1}{\omega_e \tau_a} \int_{-\omega_e \tau_a/2}^{\omega_e \tau_a/2} I(r', \psi') \, d\psi' \right] \ast\ast\ b_0(r', \psi'), \qquad (6.79)$$

where $b_0$ is the synthesized beam. 

The fractional decrease in the peak response to the point source is most easily 
considered in the $(l', m')$ plane. With an east-west baseline, the contours of the 
synthesized beam are approximately circular in the $(l', m')$ plane, as long as the 
observing time is approximately 12 h, which results in spacing loci in the form 
of complete circles in the $(u', v')$ plane. If we assume that the synthesized beam 
can be represented by a Gaussian function, as in the calculations for the bandwidth 
effect, the curve for the rectangular bandwidth in Fig. 6.12 can also be used for 
the averaging effect. In one case, the spreading function is radial and of width 
$r_1 \Delta\nu/\nu_0$, and in the other, it is circumferential and of width $r_1' \omega_e \tau_a$. Thus, for the 
averaging effect, we can replace $r_1 \Delta\nu/\theta_b \nu_0$ in Eq. (6.75) and Fig. 6.12 (solid curve) 
by $r_1' \omega_e \tau_a/\theta_b'$, noting that $r_1' = \sqrt{l_1^2 + m_1^2 \sin^2 \delta_0}$ and $\theta_b'$, the synthesized beamwidth 
in the $(l', m')$ plane, is equal to the east-west beamwidth in the $(l, m)$ plane. Hence, 
for the decrease in the response to a point source resulting from averaging, using 
Eq. (6.75), we can write 

$$R_a = 1.0645\, \frac{\theta_b'}{r_1' \omega_e \tau_a}\, \mathrm{erf}\left( 0.8326\, \frac{r_1' \omega_e \tau_a}{\theta_b'} \right). \qquad (6.80)$$

Generally, one chooses $\tau_a$ so that $R_a$ is only slightly less than unity at any point in 
the image, in which case we can approximate the error function by the integral of 
the first two terms in the power series for a Gaussian function: 

$$R_a \approx 1 - \frac{1}{3} \left( \frac{0.8326\, \omega_e \tau_a}{\theta_b'} \right)^2 \left( l_1^2 + m_1^2 \sin^2 \delta_0 \right). \qquad (6.81)$$

This formula can be used for checking that $\tau_a$ is not too large. 
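
A check of this kind can be sketched in a few lines. The code below is an illustrative assumption rather than a procedure from the text: it evaluates the error-function expression of Eq. (6.80) and the quadratic approximation of Eq. (6.81) for an east-west array, with hypothetical function names and example values. The angles `l1`, `m1`, and `theta_b` must be in the same units (radians here), and `dec0` is the declination of the field center in radians.

```python
import math

OMEGA_E = 7.292e-5  # angular velocity of the Earth's rotation, rad/s


def r_a_exact(l1, m1, dec0, tau_a, theta_b):
    """Eq. (6.80): reduction in peak response due to time averaging."""
    r1_prime = math.sqrt(l1 ** 2 + (m1 * math.sin(dec0)) ** 2)
    x = r1_prime * OMEGA_E * tau_a / theta_b
    if x == 0.0:
        return 1.0  # no smearing at the field center or for tau_a = 0
    return 1.0645 / x * math.erf(0.8326 * x)


def r_a_approx(l1, m1, dec0, tau_a, theta_b):
    """Eq. (6.81): small-smearing approximation to Eq. (6.80)."""
    scale = (0.8326 * OMEGA_E * tau_a / theta_b) ** 2
    return 1.0 - scale * (l1 ** 2 + (m1 * math.sin(dec0)) ** 2) / 3.0


if __name__ == "__main__":
    arcsec = math.radians(1.0 / 3600.0)
    # Hypothetical case: 1 arcsec beam, source 5 arcmin from the field
    # center along l, declination 60 deg, 10 s averaging time.
    args = (300 * arcsec, 0.0, math.radians(60.0), 10.0, 1.0 * arcsec)
    print(f"Eq. (6.80): {r_a_exact(*args):.4f}")
    print(f"Eq. (6.81): {r_a_approx(*args):.4f}")
```

For this example, the loss in peak response is close to 1%, and the two expressions agree to within a few parts in ten thousand.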

Two aspects of the behavior predicted by Eq. (6.81) should be mentioned. First, 
if the source is near the $m'$ axis and at a low declination, the averaging has very 
little effect. This is because the ridges of the sinusoidal corrugations of the visibility 
function then run approximately parallel to the $u'$ axis, and in the transformation 
$v' = v\,\mathrm{cosec}\,\delta_0$, the period of the variations in the $v$ direction is expanded by a large 
factor. In comparison, the arc through which any baseline vector moves in time $\tau_a$ is 
small, and hence, the averaging has only a small effect on the visibility amplitude. 
Second, for a source on the $l'$ axis, $R_a$ is independent of $\delta_0$. In this case, the ridges 
of the corrugations run parallel to the $v$ axis, and the expansion of the scale in the $v$ 
direction has no effect on the sinusoidal period. 

For arrays that contain baselines other than east-west, the centers of the 
corresponding loci in the (u’, v’) plane are offset from the origin, as in Fig. 6.13a, 
and a time offset is no longer equivalent to a simple rotation of axes. However, this 
may not increase the smearing of the visibility, so the effect may be no worse than 
for an east—west array with baselines of similar lengths. 


6.5 Speed of Surveying 


The requirement to maximize the efficiency of use of large instruments requires 
consideration of the best procedures for surveying, i.e., searching large areas of the 
sky for radio sources of various types including transient sources. In the frequency 
range below about 2 GHz, four of the key science applications of the proposed 
Square Kilometre Array that require imaging of a significant fraction of the sky are 
as follows. 


1. Searching for pulsars in binary combinations with neutron stars or black holes, 
for gravitational studies. 

2. Measurement of Faraday rotation in very large numbers of radio galaxies to 
determine the structure of galactic and intergalactic magnetic fields. 

3. Imaging of very large numbers of HI galaxies out to redshifts of z ~ 1.5 to study 
galactic evolution and provide further constraints on the nature of dark energy. 

4. Detection of transient events such as afterglows of gamma-ray bursts. 


The choice of parameters for optimization of speed in survey observations is 
not the same as for optimization of sensitivity in targeted studies (Bregman 2005). 
Consider first the case of targeted observations of individual continuum sources that 
have angular dimensions small compared with a station⁷ beam. We can adapt the 
expression for the rms noise [Eq. (6.62)] as a measure of the minimum detectable 
flux density $S_{min}$ observed with two stations in a time $\tau$: 

$$S_{min} = \frac{2kT_S}{A\sqrt{\Delta\nu\,\tau}}, \qquad (6.82)$$

where $A$ is here the collecting area of a station, equal to $A\eta_Q$ of Eq. (6.62). For an 
array with $n_s$ stations, there are $n_s(n_s - 1)/2 \approx n_s^2/2$ correlated pairs of signals, so 
the right side of Eq. (6.82) is multiplied by a factor $\sqrt{2}/n_s$. The observing speed, 
i.e., the number of observations per unit time, is 

$$\frac{1}{\tau} = \frac{A^2 \Delta\nu\, S_{min}^2\, n_s^2}{8k^2 T_S^2}. \qquad (6.83)$$

Next, consider the case of a survey in which we are concerned with the speed of 
coverage of a specified solid angle of sky down to a sensitivity level $S_{min}$. Since $A$ is 
the area of a station, the solid angle of a station beam is $\lambda^2/A$ sr, where $\lambda$ is the 
wavelength. If each station forms $n_{sb}$ simultaneous beams, the instantaneous field of 
view is 

$$F_v = \lambda^2 n_{sb}/A. \qquad (6.84)$$

The reciprocal of the time $\tau$ required to cover solid angle $F_v$ down to flux density 
level $S_{min}$ is given by Eq. (6.83), and the corresponding survey speed is 

$$\frac{F_v}{\tau} = \frac{\lambda^2 A\, \Delta\nu\, S_{min}^2\, n_s^2\, n_{sb}}{8k^2 T_S^2} \ \text{sr per unit time}. \qquad (6.85)$$
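
The relations in Eqs. (6.82)-(6.85) can be collected into a small set of functions. This is a sketch under stated assumptions, not code from the original text: the function and parameter names are hypothetical, all quantities are in SI units (flux density in W m⁻² Hz⁻¹), and the example station parameters are invented for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def s_min_pair(t_sys, area, bandwidth, tau):
    """Eq. (6.82): minimum detectable flux density for two stations."""
    return 2.0 * K_B * t_sys / (area * math.sqrt(bandwidth * tau))


def observing_speed(t_sys, area, bandwidth, s_min, n_s):
    """Eq. (6.83): observations per unit time (1/tau) for n_s stations."""
    return area ** 2 * bandwidth * s_min ** 2 * n_s ** 2 / (8.0 * K_B ** 2 * t_sys ** 2)


def field_of_view(wavelength, area, n_sb):
    """Eq. (6.84): instantaneous field of view in steradians."""
    return wavelength ** 2 * n_sb / area


def survey_speed(t_sys, area, bandwidth, s_min, n_s, wavelength, n_sb):
    """Eq. (6.85): solid angle surveyed per unit time, sr/s."""
    return (wavelength ** 2 * area * bandwidth * s_min ** 2 * n_s ** 2 * n_sb
            / (8.0 * K_B ** 2 * t_sys ** 2))


if __name__ == "__main__":
    # Hypothetical array: 50 stations of 500 m^2, T_S = 40 K, 100 MHz
    # bandwidth, 4 beams per station, 0.3 m wavelength, S_min = 10 microJy.
    s_min = 10e-6 * 1e-26
    speed = survey_speed(40.0, 500.0, 100e6, s_min, 50, 0.3, 4)
    print(f"survey speed: {speed:.3e} sr/s")
```

The survey speed is simply the product of the field of view and the observing speed, which is why the aperture area enters as $A$ rather than $A^2$.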


For surveys to detect spectral line features, $\Delta\nu$ in Eqs. (6.83) and (6.85) represents 
the bandwidth of the line. Then, if it is necessary to search in frequency, the 
bandwidth of the receiving system can be included as an additional factor in the 
expression for the speed in Eq. (6.85). 

⁷ For an array with a large total collecting area, it may be more practical to use a large number 
of small antennas rather than a smaller number of large antennas. To limit the number of 
cross-correlated signal pairs, the small antennas are located in groups. These groups are commonly 
referred to as stations, and the signals from the antennas at a station are combined to provide 
a number of beams within the main beam of the individual antennas. For pairs of stations, the 
signals from corresponding beams are cross-correlated to provide the visibility data. 

Comparison of Eqs. (6.83) and (6.85) shows the effect of the field-of-view 
dependence in the survey case. The survey speed is proportional to the number 
of simultaneous beams and is less strongly dependent on the station aperture 
area $A$. The wavelength-squared factor results from the increased beamwidth with 
decreasing frequency, but the benefit of lower frequency (increasing $\lambda$) on the survey 
speed applies only so long as the effect of the galactic background radiation on the 
system temperature is small. From the galactic background model of Dulk et al. 
(2001), the brightness temperature in the range 10-1,000 MHz is approximately 
proportional to $\nu^{-2.5}$, so for frequencies at which this background is the dominant 
contributor to $T_S$, the survey speed, which varies as $\lambda^2/T_S^2$, is approximately 
proportional to $\nu^3$. For directions that are not close to the galactic plane, the 
background temperature is about 20 K at 500 MHz and 2 K at 1 GHz, so if the 
receiver contribution to $T_S$ is ~20 K, there is a broad maximum in survey speed 
between these two frequencies. 
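
The location of this maximum can be illustrated with a rough numerical sketch. The model below is an assumption made for illustration only: it takes the survey speed as proportional to $\lambda^2/T_S^2$ [Eq. (6.85) with all other factors held fixed], a receiver contribution of 20 K, and a background of 20 K at 500 MHz scaled as a single power law in frequency with index -2.5. The function names are hypothetical.

```python
def relative_survey_speed(freq_mhz, t_rec=20.0, t_gal_500=20.0, index=-2.5):
    """Survey speed up to a constant factor: lambda^2 / T_S^2.

    The galactic background is modeled (as an assumption) by a single
    power law normalized to t_gal_500 kelvins at 500 MHz.
    """
    t_sys = t_rec + t_gal_500 * (freq_mhz / 500.0) ** index
    return 1.0 / (freq_mhz ** 2 * t_sys ** 2)


def best_frequency(freqs_mhz, **kwargs):
    """Frequency in the given grid at which the relative speed is largest."""
    return max(freqs_mhz, key=lambda f: relative_survey_speed(f, **kwargs))


if __name__ == "__main__":
    grid = range(100, 2001, 10)
    print(f"maximum survey speed near {best_frequency(grid)} MHz")
```

With these assumed values the maximum falls between 500 MHz and 1 GHz, and with the receiver contribution set to zero the speed rises as the cube of the frequency, as stated above.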

Note that the discussion above involves the assumption that the sensitivity is 
limited only by the system noise. If dynamic range is the limiting factor, then 
the density of the $(u, v)$ coverage, which improves with increasing $n_s$ and $\tau$, may 
become the most important consideration. In either case, performance improves 
with increasing number of stations. 

Survey speed can be increased by increasing the number of stations as well as the 
number of station beams. However, the size of the correlator system for the full array 
is proportional to $n_s^2$ and to $n_{sb}$, so increasing the number of stations, or the number 
of station beams, requires an increase in the size of the correlator. Increasing the 
station aperture $A$ is likely to require adding more antennas to the station subarray 
and thus increases the station beamforming hardware. The only way of increasing 
the observing speed that does not increase the signal-processing requirements is 
reducing the system temperature $T_S$. However, the complexity of phased-array feeds 
for the formation of multiple beams from a single parabolic antenna can degrade 
the system temperature. If cryogenic cooling is necessary, the required cooling 
capacity is considerably greater for multiple-beam systems than for single-beam 
ones. Thus, optimization of the array performance for a given overall cost requires 
a broad consideration of the performance of various parts of the receiving system. 


Appendix 6.1 Partial Rejection of a Sideband 


In an SSB system using a mixer as the input stage, the unwanted (image) sideband 
may be rejected by one of several schemes. These include use of a waveguide 
filter, a Martin-Puplett interferometer (Martin and Puplett 1969; Payne 1989), 
a tuned backshort, or a sideband-separating configuration of two mixers (as in 
Appendix 7.1). Practical considerations, particularly at millimeter wavelengths, can 
limit the effectiveness of the rejection of the image sideband. Let the response to the 
image sideband, in terms of the power gain of the receiver, be $\rho$ times the response 
to the wanted (signal) sideband, where $0 < \rho < 1$. In practice, $\rho$ could be as large 
as ~1/10. 

Fig. A6.1 Vectors in the complex plane representing the parameters in Eqs. (A6.1) 
and (A6.2). The constant gain factor $G_{mn}$ is omitted. If $\rho$ is known and $C_0$ and $C_{\pi/2}$ are 
measured, Eq. (A6.3) gives the optimum estimate of $V$. [Figure axis label: Im.] 

In the case of spectral line observation, where the wanted line occurs only in 
the signal sideband, the effect of the noise introduced by the image sideband is 
to increase the rms noise at the correlator output by a factor $(1 + \rho)$. Thus, the 
sensitivity is correspondingly reduced (Fig. A6.1). 

In the case of a continuum observation, the image sideband also introduces a 
component of signal and noise at the correlator. Assume that the visibility is the 
same in both sidebands, the fringes are stopped, and $\pi/2$ phase switching of the first 
LO allows measurement of the complex visibility. A complex correlator is used, and 
for simplicity, we consider that the instrumental phase is adjusted so that the line AB 
in Fig. 6.5b is coincident with the real axis. We can represent the complex correlator 
output with zero phase shift of the LO as 

$$C_0 = G_{mn}(V + \rho V^*), \qquad (A6.1)$$

and with the $\pi/2$ phase switch as 

$$C_{\pi/2} = G_{mn}(jV - j\rho V^*). \qquad (A6.2)$$

Here, $G_{mn}$ is the gain in the signal sideband, so $\rho G_{mn}$ is the gain in the image 
sideband. Note that in the expression for $C_{\pi/2}$, the $j$ factors have opposite signs for 
the two sidebands, because the $\pi/2$ phase shift causes the corresponding vectors in 
the complex plane to rotate through $\pi/2$ in opposite directions, as in Fig. A6.1. The 
optimum estimate of the visibility is then found to be 

$$V = \frac{1}{2G_{mn}} \left[ \frac{1}{1 + \rho^2} \left( C_0 - jC_{\pi/2} \right) + \frac{\rho}{1 + \rho^2} \left( C_0 + jC_{\pi/2} \right)^* \right]. \qquad (A6.3)$$

The first term within the square brackets represents the response of the signal 
sideband, and the second term represents the response of the image. The total noise 
power delivered to the correlator input [i.e., the sum of the two terms in the square 
brackets in Eq. (A6.3)] is proportional to $(1 + \rho)$, so the noise associated with the 
first term in the square brackets is proportional to $(1 + \rho)/(1 + \rho^2)$. Similarly, for 
the second term, the associated noise is proportional to $\rho(1 + \rho)/(1 + \rho^2)$. Thus, the 
total noise in the estimate of $V$ from Eq. (A6.3) is proportional to $(1 + \rho)^2/(1 + \rho^2)$. 
In terms of the rms noise, the sensitivity is proportional to the square root of the 
reciprocal of the last term, i.e., $\sqrt{1 + \rho^2}/(1 + \rho)$. For $\rho \sim 1/10$ or less, the $\rho^2$ 
term is very small, and the sensitivity degradation factor is approximately $(1 + \rho)^{-1}$ 
(Thompson and D'Addario 2000). 
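
The weighted combination of the two correlator outputs can be checked numerically. The sketch below is illustrative and rests on assumptions: the function names are hypothetical, the gain $G_{mn}$ is taken to be real, and the overall normalization of the estimate (a factor of $1/2G_{mn}$) is the one implied by Eqs. (A6.1) and (A6.2) as written here.

```python
import math


def correlator_outputs(v, rho, gain=1.0):
    """Eqs. (A6.1) and (A6.2): outputs with 0 and pi/2 LO phase shift."""
    c0 = gain * (v + rho * v.conjugate())
    c_pi2 = gain * (1j * v - 1j * rho * v.conjugate())
    return c0, c_pi2


def estimate_visibility(c0, c_pi2, rho, gain=1.0):
    """Weighted estimate of V from the signal- and image-sideband terms."""
    signal_term = (c0 - 1j * c_pi2) / (1.0 + rho ** 2)
    image_term = rho * (c0 + 1j * c_pi2).conjugate() / (1.0 + rho ** 2)
    return (signal_term + image_term) / (2.0 * gain)


def sensitivity_factor(rho):
    """Relative sensitivity sqrt(1 + rho^2) / (1 + rho)."""
    return math.sqrt(1.0 + rho ** 2) / (1.0 + rho)


if __name__ == "__main__":
    v_true = 0.7 - 0.4j  # hypothetical visibility
    c0, c_pi2 = correlator_outputs(v_true, rho=0.1, gain=2.5)
    print(estimate_visibility(c0, c_pi2, rho=0.1, gain=2.5))
    print(sensitivity_factor(0.1), 1.0 / 1.1)
```

In the noise-free case the estimate returns the input visibility, and for a rejection ratio of 1/10 the sensitivity factor differs from $(1 + \rho)^{-1}$ by less than one percent.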

Chapter 7 
System Design 


In this chapter, we consider certain aspects of the design of the interferometric 
system in more detail. This discussion primarily involves parts of the system where 
the signals are in analog form. The trend in technology has been to convert signals 
as early as possible in the signal chain, following the antennas, into digital form to 
facilitate data handling, avoid low-level distortions, and generally take advantage of 
the rapid progress in the development of digital equipment and computers. Three 
key items are discussed: (1) low noise amplification of signals at the antenna output 
to minimize the effect of additive noise, (2) phase-stable transmission systems 
that allow the transfer of reference timing and phase signals from the central 
communications hub of the instrument to the antennas, and (3) the synchronous 
phase switching systems needed to eliminate spurious responses in the correlator 
output. The analysis here leads to specification of tolerances on system parameters 
that are necessary to achieve the goals of sensitivity and accuracy. 


7.1 Principal Subsystems of the Receiving Electronics 


Optimum techniques and components for implementation of the electronic hardware 
vary continuously as the state of the art advances, and descriptions in the literature 
provide examples of the practical techniques current at various times: see, for 
example, Read (1961), Elsmore et al. (1966), Baars et al. (1973), Bracewell et al. 
(1973), Wright et al. (1973), Welch et al. (1977, 1996), Thompson et al. (1980), 
Batty et al. (1982), Erickson et al. (1982), Napier et al. (1983), Sinclair et al. (1992), 
Young et al. (1992), Napier et al. (1994), de Vos et al. (2009), Perley et al. (2009), 
Wootten and Thompson (2009), and Prabu et al. (2015). The earlier papers in this 
list are mainly of interest from the viewpoint of the development of the technology. 

Fig. 7.1 Basic elements of the receiving system of a synthesis array. Here, the received signals are 
converted to an intermediate frequency (IF), digitized, and then transmitted by optical fiber to the 
central location for the derivation of visibility data. In systems of earlier design, and some smaller 
systems, the IF signal is transmitted to the central location in analog form and then digitized. LO 
indicates a local oscillator (i.e., usually one within the receiving system). 


Figure 7.1 shows a simplified schematic diagram of the receiving system 
associated with one antenna of a synthesis array. Note that digitization of the signals 
is introduced as early as possible in the system, thus allowing most of the signal 
processing to be implemented digitally. In very early interferometers, there was no 
digitization, and the output was displayed on a chart recorder. In the original VLA 
system, the digitization occurred at the central location just before the delay and 
correlator processing. In the later VLA system (Perley et al. 2009), the signals are 
digitized at the antenna stations. 


7.1.1 Low-Noise Input Stages 


In radio astronomy receivers, minimizing the noise temperature usually involves 
cryogenic cooling of the amplifier or mixer stages from the input up to a point 
at which noise from succeeding stages is unimportant. The low-noise input stages 
are often packaged with a cooling system, and sometimes also a feed horn, in a 
single unit often referred to as the front end. The active components are usually 
transistor amplifiers or, for millimeter wavelengths, SIS 
(superconductor-insulator-superconductor) mixers followed by transistor amplifiers. 
For descriptions, see, 
for example, Reid et al. (1973), Weinreb et al. (1977a), Weinreb et al. (1982), 
Casse et al. (1982), Phillips and Woody (1982), Tiuri and Räisänen (1986), Payne 
(1989), Phillips (1994), Payne et al. (1994), Webber and Pospieszalski (2002), and 
Pospieszalski (2005). 

In discussing the level of noise associated with a receiver, we begin by 
considering the case in which the Rayleigh-Jeans approximation suffices. This is the 
domain in which $h\nu/kT \ll 1$, where $h$ is Planck's constant and $T$ is the temperature 
of the thermal noise source involved. As noted in the discussion following Eq. (1.1), 
this condition can be written as $\nu$ (GHz) $\ll 20T$, where $T$ is the system noise 
temperature in kelvins. It is convenient to specify noise power in terms of the 
temperature of a resistive load matched to the receiver input. In the Rayleigh-Jeans 
approximation, noise power available at the terminals of a resistor at temperature $T$ 
is $kT\Delta\nu$, where $k$ is Boltzmann's constant and $\Delta\nu$ is the bandwidth within which 
the noise is measured (Nyquist 1928). One kelvin of temperature therefore represents 
a power spectral density of $k$ W Hz$^{-1}$ (about $1.38 \times 10^{-23}$ W Hz$^{-1}$). The receiver 
temperature $T_R$ is a measure of the 
internally generated noise power within the system and is equal to the temperature of 
a matched resistor at the input of a hypothetical noise-free (but otherwise identical) 
receiver that would produce the same noise power at the output. The system 
temperature, $T_S$, is a measure of the total noise level and includes, in addition to $T_R$, 
the noise power from the antenna and any lossy components between the antenna 
and the receiver: 

$$T_S = T_A' + (L - 1)T_L + LT_R, \qquad (7.1)$$

where $T_A'$ is the antenna temperature resulting from the atmosphere and other 
unwanted sources of noise, $L$ is the power loss factor of the transmission line 
from the antenna to the receiver [defined as (power in)/(power out)], and $T_L$ is the 
temperature of the line. In defining the noise temperature of the receiver, we should 
note that in practice, a receiver is always used with the input attached to some 
source impedance that is itself a source of noise. The noise at the receiver output 
thus consists of two components, the noise from the source at the input, which is 
the antenna and transmission line in Eq. (7.1), and the noise generated within the 
receiver. 


7.1.2 Noise Temperature Measurement 


The noise temperature of a receiver is often measured by the Y-factor method. The 
thermal noise sources used in this measurement are usually impedance-matched 
resistive loads connected to the receiver input by waveguide or coaxial line. The 
receiver input is connected sequentially to two loads at temperatures $T_{hot}$ and $T_{cold}$. 
The measured ratio of the receiver output powers in these two conditions is the 
factor $Y$: 

$$Y = \frac{T_R + T_{hot}}{T_R + T_{cold}}, \qquad (7.2)$$

and thus, 

$$T_R = \frac{T_{hot} - YT_{cold}}{Y - 1}. \qquad (7.3)$$

Commonly used values are $T_{hot} = 290$ K (ambient temperature) and $T_{cold} \approx 77$ K 
(liquid nitrogen temperature). For very precise measurements of $T_R$, it is important 
to note that the boiling point of liquid nitrogen depends on the ambient pressure. 
The receiver temperature can be expressed in terms of the noise temperatures of 
successive stages through which the signal flows [see, e.g., Kraus (1986)]: 

$$T_R = T_{R1} + T_{R2}G_1^{-1} + T_{R3}(G_1 G_2)^{-1} + \cdots. \qquad (7.4)$$

Here $T_{Ri}$ is the noise temperature of the $i$th receiver stage, and $G_i$ is its power gain. 
If the first stage is a mixer instead of an amplifier, $G_1$ may be less than unity, and 
the second-stage noise temperature then becomes very important. 
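
The Y-factor relations and the cascade formula are simple enough to sketch directly. The code below is a minimal illustration with hypothetical function names and invented stage values; it assumes load temperatures for which the Rayleigh-Jeans approximation holds, and power gains given as linear ratios rather than decibels.

```python
def y_factor(t_rec, t_hot=290.0, t_cold=77.0):
    """Eq. (7.2): ratio of output powers with hot and cold loads."""
    return (t_rec + t_hot) / (t_rec + t_cold)


def receiver_temperature(y, t_hot=290.0, t_cold=77.0):
    """Eq. (7.3): receiver temperature from a measured Y factor."""
    return (t_hot - y * t_cold) / (y - 1.0)


def cascade_temperature(stage_temps, stage_gains):
    """Eq. (7.4): noise temperature of cascaded stages.

    stage_temps : noise temperature of each stage (K), input stage first
    stage_gains : power gain of each stage (linear ratio), same order
    """
    total = 0.0
    gain_ahead = 1.0  # product of the gains of all preceding stages
    for temp, gain in zip(stage_temps, stage_gains):
        total += temp / gain_ahead
        gain_ahead *= gain
    return total


if __name__ == "__main__":
    y = y_factor(50.0)
    print(f"Y = {y:.3f}, recovered T_R = {receiver_temperature(y):.1f} K")
    # Amplifier first (gain 100), then a 300 K second stage.
    print(cascade_temperature([20.0, 300.0], [100.0, 10.0]))
    # Mixer first with conversion loss (gain 0.5), then a 10 K amplifier.
    print(cascade_temperature([50.0, 10.0], [0.5, 100.0]))
```

The last example shows the point made above: with a first-stage gain of 0.5, the second-stage noise temperature is doubled when referred to the input.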

For cryogenically cooled receivers for millimeter and shorter wavelengths, the 
Rayleigh—Jeans approximation can introduce significant errors. The power spectral 
density (power per unit bandwidth) of the noise is no longer linearly proportional 
to the temperature of the radiator or source. The ratio h/k is equal to 0.048 K 
per gigahertz, so if, for example, T = 4 K (liquid helium temperature), then 
hv/kT = 1 for v = 83 GHz. Thus, quantum effects become important as frequency 
is increased and temperature decreased. Under these conditions, the noise power 
per unit bandwidth divided by k provides an effective noise temperature that can be 
used in noise calculations, instead of the physical temperature. Two formulas are in 
use that give the effective temperature for a thermal source when quantum effects 
become important. One is the Planck formula and the other the Callen and Welton 
formula (Callen and Welton 1951). The effective noise temperatures for a waveguide 
carrying a single mode and terminated in a thermal load, or for a transmission line 
terminated in a resistive load, given by the two formulas are as follows: 


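\[ T_{\rm Planck} = T \left[ \frac{h\nu/kT}{e^{h\nu/kT} - 1} \right] \tag{7.5} \]

and

\[ T_{\rm C\&W} = T \left[ \frac{h\nu/kT}{e^{h\nu/kT} - 1} \right] + \frac{h\nu}{2k} , \tag{7.6} \]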

Fig. 7.2 Noise temperature vs. physical temperature for blackbody radiators at 230 GHz,
according to the Rayleigh–Jeans, Planck, and Callen and Welton formulas. Also shown (broken lines) are
the differences between the three radiation curves. The Rayleigh–Jeans curve converges with the
Callen and Welton curve at high temperature, while the Planck curve is always hν/2k below the
Callen and Welton curve. From Kerr et al. (1997).


where T is the physical temperature. From Eqs. (7.5) and (7.6), we have 


\[ T_{\rm C\&W} = T_{\rm Planck} + \frac{h\nu}{2k} . \tag{7.7} \]

The Callen and Welton formula is equal to the Planck formula with an additional
term, hν/2k, which represents an additional half-photon. This half-photon is the
noise level from a body at absolute zero temperature and is referred to as the
zero-point fluctuation noise. Figure 7.2 shows the relationships between physical
temperature and noise temperature corresponding to the Rayleigh–Jeans, Planck,
and Callen and Welton formulas, for a frequency of 230 GHz. Note that for the case
of hν/kT ≪ 1, we can put exp(hν/kT) − 1 ≈ (hν/kT) + ½(hν/kT)², in which
case the Callen and Welton formula reduces to the Rayleigh–Jeans formula, but the
result from the Planck formula is lower by hν/2k.
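
The behavior of the two formulas at 230 GHz can be checked numerically. This is a sketch under stated assumptions: the function names are illustrative, and h/k is taken as 0.04799 K per gigahertz (the text rounds it to 0.048).

```python
import math

H_OVER_K = 0.04799  # h/k in K per GHz (assumed value; the text quotes 0.048)


def t_planck(t_phys, freq_ghz):
    """Eq. (7.5): effective noise temperature from the Planck formula."""
    x = H_OVER_K * freq_ghz / t_phys          # h*nu / (k*T)
    return t_phys * x / math.expm1(x)


def t_callen_welton(t_phys, freq_ghz):
    """Eq. (7.6): Planck value plus the zero-point term h*nu / (2k)."""
    return t_planck(t_phys, freq_ghz) + 0.5 * H_OVER_K * freq_ghz


FREQ_GHZ = 230.0
half_photon = 0.5 * H_OVER_K * FREQ_GHZ       # about 5.5 K at 230 GHz

# (physical T, Planck, Callen and Welton) for three load temperatures
table = [(t, t_planck(t, FREQ_GHZ), t_callen_welton(t, FREQ_GHZ))
         for t in (4.0, 77.0, 290.0)]
```

At 290 K the Callen and Welton value lies within a few hundredths of a kelvin of the physical temperature, whereas the Planck value is lower by about 5.5 K, as in Fig. 7.2.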

When using Eq. (7.3) to derive the noise temperature of a receiver, the values of
T_hot and T_cold should be the noise temperatures derived from the Planck or Callen
and Welton formulas, not the physical temperatures of the loads (except in the
Rayleigh–Jeans domain). Thus, for the Planck formula, we can write


\[ T_{R({\rm Planck})} = \frac{T_{\rm hot(Planck)} - Y\,T_{\rm cold(Planck)}}{Y - 1} , \tag{7.8} \]

and a similar equation for the Callen and Welton formula. From Eqs. (7.4), (7.5),
and (7.6), we obtain

\[ T_{R({\rm Planck})} = T_{R({\rm C\&W})} + \frac{h\nu}{2k} . \tag{7.9} \]

In using any measurement of receiver noise temperature, it is important to know 
whether, in deriving it, the Planck formula, the Callen and Welton formula, or the 
physical temperature of the loads (i.e., the Rayleigh—Jeans approximation) was used. 
If the noise temperatures of the individual components are derived from the physical 
temperatures using the Callen and Welton formula, the temperature sum will be 
greater by hν/2k than if the Planck formula were used; see Eq. (7.7). However, if
the Callen and Welton formula is used to derive the receiver noise temperature, the
result will be less by hν/2k than if the Planck formula were used; see Eq. (7.9). Thus
the system temperature, which is the sum of the input temperature and the receiver 
temperature, will be the same whichever of the two formulas is used. However, 
to avoid confusion, it is important to use one formula or the other consistently 
throughout the derivation of the noise temperatures. 
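
The cancellation described above can be illustrated with a short calculation. In this sketch the measured Y factor (2.5) and the 30 K physical temperature of the input source are hypothetical, and the function names are illustrative only.

```python
import math

H_OVER_K = 0.04799  # h/k in K per GHz (assumed value)


def planck(t_phys, freq_ghz):
    x = H_OVER_K * freq_ghz / t_phys
    return t_phys * x / math.expm1(x)


def callen_welton(t_phys, freq_ghz):
    return planck(t_phys, freq_ghz) + 0.5 * H_OVER_K * freq_ghz


def receiver_temperature(y, t_hot_eff, t_cold_eff):
    """Eq. (7.3) applied to effective (not physical) load temperatures."""
    return (t_hot_eff - y * t_cold_eff) / (y - 1.0)


FREQ_GHZ = 230.0
Y_MEASURED = 2.5       # hypothetical measured Y factor
T_INPUT_PHYS = 30.0    # hypothetical physical temperature of the input source

results = {}
for name, effective in (("Planck", planck), ("C&W", callen_welton)):
    t_r = receiver_temperature(Y_MEASURED,
                               effective(290.0, FREQ_GHZ),
                               effective(77.0, FREQ_GHZ))
    t_in = effective(T_INPUT_PHYS, FREQ_GHZ)
    results[name] = (t_r, t_in, t_r + t_in)   # last entry is T_sys
```

The two receiver temperatures differ by hν/2k, the two input temperatures differ by the same amount with opposite sign, and the system temperatures agree.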

Differing opinions have been expressed on the nature of the zero-point fluctuation 
noise, and whether it should be considered as originating in the load connected to 
the receiver or in the receiver input stages; see, for example, Tucker and Feldman 
(1985), Zorin (1985), and Wengler and Woody (1987). At frequencies at which 
quantum effects become most important, the usual type of input stage in radio 
astronomy receivers is the SIS mixer, for which the quantum theory of operation is 
given by Tucker (1979). For a summary from various authors of some conclusions 
relevant to noise temperature considerations, see Kerr et al. (1997) and Kerr (1999). 

To recapitulate: The radiation level predicted by the Callen and Welton formula 
is equal to the Planck radiation level plus the zero-point fluctuation component 
hv/2. The latter component is attributable to the power from a blackbody or 
matched resistive load at absolute zero temperature. An amplifier noise temperature 
derived using the Callen and Welton formula to interpret the measured Y factor is 
lower than that derived using the Planck formula by hv/2k. However, an antenna 
temperature obtained using the Callen and Welton formula is higher by hv/2k than 
the corresponding Planck formula value. The system temperature, which is the sum 
of the noise temperature and the antenna temperature, is the same in either case. 
Since the system temperature determines the sensitivity of a radio telescope, these 
details may seem unimportant. However, in procuring an amplifier or mixer for a 
receiver input stage, it is important to know how the noise temperature is specified. 

In addition to the noise generated in the electronics, the noise in a receiving 
system contains components that enter from the antenna. These components arise 
from cosmic sources, the cosmic background radiation, the Earth’s atmosphere, 
the ground, and other objects in the sidelobes of the antenna. The opacity of the 
atmosphere, from which the atmospheric contribution to the system noise arises, is 
discussed in Chap. 13. 


7.1.3 Local Oscillator


As explained in the previous chapter, local oscillator (LO) signals are required at 
the antenna locations and sometimes at other points along the signal paths to the 
correlators. The corresponding oscillator frequencies for different antennas must 
be maintained in phase synchronism to preserve the coherence of the signals. The 
phases of the oscillators at corresponding points on different antennas need not 
be identical, but the differences should be stable enough to permit calibration. 
Maintaining synchronism at different antennas requires transmitting one or more 
reference frequencies from a central master oscillator to the required points, where 
they may be used to phase-lock other oscillators. The frequencies required at the 
mixers can then be synthesized. 

Special phase shifts are required at certain mixers to implement fringe rotation 
(fringe stopping), as described in Sect. 6.1, and to implement phase switching, as 
described in Sect. 7.5. Often these can best be synthesized by digital techniques, 
which can provide a signal at a frequency of, say, a few megahertz that contains the 
required frequency offsets and phase changes. These can be transferred to the LO 
frequency by using the synthesized signal as a reference frequency in a phase-locked 
loop. 


7.1.4 IF and Signal Transmission Subsystems 


After amplification in the low-noise front-end stages, the signals pass through 
various IF amplifiers and a transmission system before reaching the correlators. 
Transmission between the antennas and a central location can be effected by means 
of coaxial or parallel-wire lines, waveguide, optical fibers, or direct radiation by 
microwave radio link. Cables are often used for small distances, but for long 
distances, the cable attenuation may require the use of too many line amplifiers, and 
optical fiber, for which the transmission loss is much lower, is generally preferred. 
Low-loss TE₀₁-mode waveguide (Weinreb 1977b; Archer et al. 1980) was used in
the construction of the original VLA system, which preceded the development of
optical fiber by a few years. Optical fiber is now used for the Very Large Array
(VLA)¹ (Perley et al. 2009). Cable or optical fiber can be buried at depths of
1–2 m to reduce temperature variations. Bandwidths of signals transmitted by cables
are usually limited to some tens or hundreds of megahertz by attenuation, and 
radio links are similarly limited by available frequency allocations. For very wide 
bandwidths, optical fibers offer the greatest possibilities. 

In the (mostly earlier) systems in which the signals are transmitted from
the antennas to the central location in analog form, phase errors resulting from
temperature effects in filters, and delay-setting errors, can be minimized by using the
lowest possible intermediate frequency (IF) at this point. Accordingly, the final IF
amplifiers may have a baseband response defined by a lowpass filter.² The response
at the low-frequency end falls off at a frequency that is a few percent of the upper
cutoff frequency.

¹With the upgraded receiving system.


7.1.5 Optical Fiber Transmission 


The introduction of optical fiber systems provided a very great advance in transmission
capability for broadband signals over long distances. Signals are modulated
onto optical carriers, commonly in the wavelength range 1300–1550 nm, and
transmitted along glass fiber. The fiber attenuation is a minimum of approximately
0.2 dB km⁻¹ near 1550 nm and is about 0.4 dB km⁻¹ at 1300 nm. These values are
much lower than can be obtained in radio frequency transmission lines. In the fiber, a
glass core is surrounded by a glass cladding of lower refractive index, so light waves
launched into the core at a small enough angle with respect to the axis of the fiber
can propagate by total internal reflection. If the inner-core diameter is approximately
50 μm, a number of different modes can be supported. These modes travel with
slightly different velocities, which results in a limitation in performance of this
multimode fiber. If the core is reduced to approximately 10 μm in diameter, only
the HE₁₁ mode propagates. Single-mode fiber of this type is required for the longest
distances and/or the highest frequencies and bandwidths. At 1550 nm, an interval
of 1 nm in wavelength corresponds to a bandwidth of approximately 125 GHz. The
low attenuation and the bandwidth capacity facilitate the use of wide bandwidths 
and long baselines in linked-element arrays. Signals can be transmitted in analog 
form or digitized and transmitted as pulse trains. Design of a fiber transmission 
system involves the characteristics of the lasers that generate the optical carriers and 
the detectors that recover the modulation, as well as the characteristics of the fiber. 
For further information, see, for example, Agrawal (1992), Borella et al. (1997), and 
Perley et al. (2009). 
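
The correspondence between wavelength interval and bandwidth follows from Δν = cΔλ/λ², and the total fiber loss scales linearly with length. The sketch below checks the figures quoted above; the 20-km path length is an assumed example and the function names are illustrative.

```python
C = 299_792_458.0  # speed of light, m/s


def optical_bandwidth_hz(wavelength_m, delta_lambda_m):
    """Frequency interval corresponding to a small wavelength interval."""
    return C * delta_lambda_m / wavelength_m ** 2


def fiber_loss_db(length_km, atten_db_per_km):
    """Total attenuation of a fiber run, in decibels."""
    return length_km * atten_db_per_km


bandwidth = optical_bandwidth_hz(1550e-9, 1e-9)   # close to 125 GHz
loss_1550 = fiber_loss_db(20.0, 0.2)              # 4 dB over an assumed 20 km
loss_1300 = fiber_loss_db(20.0, 0.4)              # 8 dB over the same path
```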

In practice, the bandwidth and distance of the transmission are limited by the 
noise in the laser that generates the optical signal at the transmitting end of the 
fiber, and the noise in the diode demodulator and the amplifier at the receiving end. 
To avoid degradation of the sensitivity in analog transmission, the power spectral
density of the signal (measured in W Hz⁻¹) must be greater than the power spectral
density of the noise generated in the transmission system by ∼20 dB for most radio
astronomy applications. However, the total signal power is limited by the need to
avoid nonlinearity of the response of the modulator or demodulator. The result is a
limit on the bandwidth of the signal, since for signals with a flat spectrum, the power
is proportional to the bandwidth. In practice, a single transmitter and receiver pair
can operate with a bandwidth of 10–20 GHz for transmission distances of some tens
of kilometers. Optical amplifiers, which most commonly operate at wavelengths
near 1550 nm, can be used to increase the range of transmission.

²In some cases, an image rejection mixer (see Appendix 7.1) is used for the conversion to baseband,
but the suppression of the unwanted sideband may then be no greater than 20–30 dB.

In the modulation process, the power of the carrier is varied in proportion to the 
voltage of the signal. Because of this, the effect of small unwanted components in 
fiber transmission systems is greatly reduced. Consider, for example, a small com- 
ponent of the optical signal resulting from a reflection within the fiber. If the optical 
power of the reflected component is x dB less than that of the main component, 
then after demodulation at the photodetector, the signal power contributed by the 
reflected component is 2x dB less than that from the main optical component. This 
also applies to small unwanted effects resulting from finite isolation of couplers, 
isolators, and other elements. Variations in the frequency response resulting from 
standing waves in microwave transmission lines are significantly less in optical fiber 
than in cable. 
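
The doubling of the decibel ratio can be seen by following the signal through the photodetector. The sketch below assumes an ideal detector whose output current (and hence signal voltage) is proportional to the optical power, so that the recovered electrical power varies as the square of the optical power; the function name is illustrative.

```python
import math


def electrical_suppression_db(optical_suppression_db):
    """Electrical power ratio (dB) for a component x dB down in optical power."""
    optical_power_ratio = 10 ** (-optical_suppression_db / 10)
    signal_voltage_ratio = optical_power_ratio        # detector output follows optical power
    electrical_power_ratio = signal_voltage_ratio ** 2
    return -10 * math.log10(electrical_power_ratio)


suppression = electrical_suppression_db(30.0)   # 60 dB for a reflection 30 dB down
```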

A feature that must be taken into account in applications of optical fiber is the
dispersion in velocity, D, usually specified in ps (nm·km)⁻¹. The difference in
the time of propagation for two optical wavelengths that differ by Δλ traveling
a distance ℓ in the fiber is DΔλℓ. Figure 7.3 shows the dispersion for two types
of fiber. Curve 1 is for a type of fiber widely used in early applications, and 
curve 2 represents a design in which the zero-dispersion wavelength is shifted to 
coincide approximately with the minimum-attenuation wavelength of 1550 nm. This 
optimization of the performance at 1550 nm is achieved by designing the fiber so 
that the dispersion of the cylindrical waveguide formed by the core of the fiber 
cancels the intrinsic dispersion of the glass at that wavelength. 

Consider a spectral component, at frequency ν_m, of a broadband signal that is
modulated onto an optical carrier. Amplitude modulation of the signal results in
sidebands spaced ±ν_m in frequency with respect to the carrier. Because of the
velocity dispersion, the two sidebands and the carrier each propagate down the 
fiber with slightly different velocities and thus exhibit relative offsets in time at 
the receiving end. Such time offsets result in attenuation of the amplitude of the 
high-frequency components of analog signals and in broadening of the pulses used 
to represent digital data. Thus, for both analog and digital transmission, dispersion
as well as noise can limit the bandwidth × distance product. An analysis of the effect
of dispersion on analog signals is given in Appendix 7.2.

Fig. 7.3 Dispersion D in single-mode optical fiber of two different designs, as a
function of the optical wavelength. [Plot: dispersion in ps/(km·nm) versus
wavelength over the range 1.1–1.7 μm.]
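
The size of the time offset DΔλℓ between the carrier and a modulation sideband can be estimated as follows. This is a rough sketch only, not the analysis of Appendix 7.2: the dispersion of 17 ps (nm·km)⁻¹, the 20-km length, and the 10-GHz modulation frequency are assumed example values, and the function name is illustrative.

```python
C = 299_792_458.0  # speed of light, m/s


def sideband_delay_ps(dispersion_ps_per_nm_km, length_km, mod_freq_hz,
                      wavelength_nm=1550.0):
    """Time offset D * delta_lambda * length between carrier and one sideband."""
    wavelength_m = wavelength_nm * 1e-9
    # A frequency offset nu_m corresponds to delta_lambda = lambda^2 * nu_m / c.
    delta_lambda_nm = (wavelength_m ** 2) * mod_freq_hz / C * 1e9
    return dispersion_ps_per_nm_km * delta_lambda_nm * length_km


# Assumed example: D = 17 ps/(nm km), 20 km of fiber, 10 GHz modulation frequency.
delay_ps = sideband_delay_ps(17.0, 20.0, 10.0e9)   # about 27 ps
fraction_of_period = delay_ps * 1e-12 * 10.0e9     # about 0.27 of a modulation period
```

For these assumed values, the offset is a substantial fraction of the modulation period, which illustrates why dispersion-shifted fiber or operation near the zero-dispersion wavelength is attractive for wide bandwidths.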


7.1.6 Delay and Correlator Subsystems 


The compensating delays and correlators can be implemented by either analog or
digital techniques. An analog delay system may consist of a series of switchable
delay units with a binary sequence of values in which the delay of the nth unit is
2^(n−1) τ₀, where τ₀ is the delay of the smallest unit. Such an arrangement, with N
units, provides a range of delay from zero to (2^N − 1)τ₀ in steps of τ₀. For delays up
to about 1 μs, lengths of coaxial cable or optical fiber have been used. The design
of analog multiplying circuits for correlators has been discussed by Allen and Frater 
(1970). An example of a broadband analog correlator is described by Padin (1994). 
However, the development of digital circuitry capable of operating at high clock 
frequencies has led to the general practice of digitizing the IF signal so that the 
delay and correlators are generally implemented digitally, as discussed in Chap. 8. 
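
The selection of switch states for a binary delay line can be sketched as below. The smallest unit of 1 ns and the ten units are assumed values chosen so that the range is about 1 μs; the function names are illustrative.

```python
def switch_settings(target_delay, tau0, n_units):
    """Return a list of booleans; entry n-1 is True if unit n (delay 2**(n-1) * tau0) is switched in."""
    steps = round(target_delay / tau0)
    if not 0 <= steps <= 2 ** n_units - 1:
        raise ValueError("target delay is outside the range of the delay line")
    return [bool((steps >> (n - 1)) & 1) for n in range(1, n_units + 1)]


def realized_delay(settings, tau0):
    """Total delay of the units that are switched in."""
    return sum(tau0 * 2 ** (n - 1)
               for n, switched_in in enumerate(settings, start=1) if switched_in)


# Assumed example: tau0 = 1 ns and N = 10 give a range of 0 to 1023 ns.
settings = switch_settings(637.4, tau0=1.0, n_units=10)
delay = realized_delay(settings, tau0=1.0)   # 637 ns; residual error 0.4 ns
```

The residual error never exceeds τ₀/2, so the size of the smallest unit sets the accuracy of the delay compensation.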


7.2 Local Oscillator and General Considerations of Phase 
Stability 


7.2.1 Round-Trip Phase Measurement Schemes 


Synchronizing of the oscillators at the antennas can be accomplished by phase- 
locking them to a reference frequency that is transmitted out from a central master 
oscillator. Buried cables or fibers offer the advantage of the greatest stability of the 
transmission path. At a depth of 1–2 m, the diurnal temperature variation is almost
entirely eliminated, but the annual variation is typically attenuated by a factor of
2–10 only. For a discussion of temperature variation in soil as a function of depth,
see Valley (1965). As an example, a 10-km-long buried cable with a temperature
coefficient of length of 10⁻⁵ K⁻¹ might suffer a diurnal temperature variation of
0.1 K, resulting in a change of 1 cm in electrical length. A similar variation would
occur in a 50-m length of cable running from the ground to the receiver enclosure 
on an antenna and subjected to a diurnal temperature variation of 20 K. Rotating 
joints and flexible cables can also contribute to phase variations. 
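
The two length changes quoted above can be reproduced directly. In the sketch below, the conversion to phase assumes that the electrical length is expressed as an equivalent free-space path and that the reference frequency is 2 GHz; both are assumptions made for illustration, as are the function names.

```python
C = 299_792_458.0  # speed of light, m/s


def length_change_m(length_m, coeff_per_k, delta_t_k):
    """Change in electrical length for a given temperature change."""
    return length_m * coeff_per_k * delta_t_k


def phase_change_deg(delta_length_m, freq_hz):
    """Phase change for a free-space-equivalent path change (assumption)."""
    return 360.0 * freq_hz * delta_length_m / C


buried = length_change_m(10_000.0, 1e-5, 0.1)   # 0.01 m for the buried cable
riser = length_change_m(50.0, 1e-5, 20.0)       # 0.01 m for the exposed 50-m run
phase = phase_change_deg(buried, 2.0e9)         # about 24 degrees at an assumed 2 GHz
```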

Path length variations can be determined by monitoring the phase of a signal of 
known frequency that traverses the path. It is necessary for the signal to travel in two 
directions, that is, out from the master oscillator and back again, since the master 
provides the reference against which the phase must be measured. This technique 
is described as round-trip phase measurement. Correction for the measured phase 
changes can be implemented in hardware by using a phase shifter driven by the 
measurement system, or in software by inserting corrections in the data from the
correlator, either in real time or during the later stages of data analysis. It is also
possible to generate a signal in which the phase changes are greatly reduced by
combining signals that travel in opposite directions in the transmission line. As
an illustration of the last procedure, consider a signal applied to the near end
of a loss-free transmission line that results in a voltage V₀ cos(2πνt) at the far
end. At a distance ℓ, measured back from the far end, the outgoing signal is
V₁ = V₀ cos 2πν(t + ℓ/v), where v is the phase velocity along the line. Suppose
that the signal is reflected from the far end without change in phase. At the same
point, distant ℓ from the far end, the returned signal is V₂ = V₀ cos 2πν(t − ℓ/v),
and the total signal voltage is

\[ V_1 + V_2 = 2 V_0 \cos(2\pi\nu t)\,\cos\!\left(\frac{2\pi\nu\ell}{v}\right) . \tag{7.10} \]


The first cosine function in Eq. (7.10) represents the radio frequency signal, the
phase of which (modulo π) is independent of ℓ and of line length variations. The
second cosine function is a standing-wave amplitude term. Such a system cannot
easily be implemented in practice because of attenuation and unwanted reflections, 
and thus more complicated schemes have evolved. In what follows, we consider 
cable transmission, although the basic principles are applicable to other systems. 
Some general considerations, including the use of microwave links, are given by 
Thompson et al. (1968). 
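
The property expressed by Eq. (7.10) can be verified numerically for an ideal, loss-free line. In the following sketch the frequency of 2 GHz and the phase velocity of 2.4 × 10⁸ m s⁻¹ are assumed example values, and the function names are illustrative.

```python
import math


def summed_voltage(t, ell, nu, v, v0=1.0):
    """Sum of the outgoing and returned signals at distance ell from the far end."""
    outgoing = v0 * math.cos(2 * math.pi * nu * (t + ell / v))
    returned = v0 * math.cos(2 * math.pi * nu * (t - ell / v))
    return outgoing + returned


def product_form(t, ell, nu, v, v0=1.0):
    """Right-hand side of Eq. (7.10)."""
    return 2 * v0 * math.cos(2 * math.pi * nu * t) * math.cos(2 * math.pi * nu * ell / v)


NU = 2.0e9   # Hz (assumed)
V = 2.4e8    # m/s (assumed)

# At t = 1/(4 nu) the radio frequency term is zero, so the summed signal
# crosses zero at the same instant at every point on the line.
t_zero = 1.0 / (4.0 * NU)
zero_crossing_values = [summed_voltage(t_zero, ell, NU, V)
                        for ell in (0.0, 10.0, 33.3, 100.0)]
```

Only the amplitude of the summed signal, and possibly its sign (hence "modulo π"), depends on the position ℓ.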


7.2.2 Swarup and Yang System 


Several different round-trip schemes have been devised as instruments have devel- 
oped, and one of the earliest of these was by Swarup and Yang (1961). A system 
based on this scheme is shown in Fig. 7.4. Part of the outgoing signal is reflected 
from a known reflection point at an antenna, and variation in the path length to the 
reflector is monitored by measuring the relative phase of the reflected component at 
the detector. The phase of the reflected signal is compared with that of a reference 
signal. The phase of the latter is variable by means of a movable probe that samples 
the outgoing signal. Since many other reflections may occur in the transmission line, 
it is necessary to identify the desired component. To do this, a modulated reflector, 
for example, a diode loosely coupled to the line, is used. This is switched between 
conducting and nonconducting states by a square wave voltage, and a synchronous 
detector is used to separate the modulated component of the reflected signal. 

An increase Δℓ in the length of the transmission line is detected as a corresponding
movement of 2Δℓ in the probe position for the null. It results in an
increase of 2πΔℓν₁/v in the phase of the frequency ν₁ at the antenna, where v
is the phase velocity in the line. The corresponding changes in LO phases and IF
phases transmitted over the same path can be calculated and applied as a correction
to the visibility phases.

Fig. 7.4 System for measuring variations in the electrical path length in a transmission line, based
on the technique of Swarup and Yang (1961). The output of the synchronous detector is a sinusoidal
function of the difference between the phases of the reference (outgoing) and reflected components
at the detector. A null output is obtained when these signal phases are in quadrature, and the
position of the probe for a null is thus a measure of the phase of the reflected signal. Because of
the isolator in the line, the probe samples only the outgoing component of the signal.


7.2.3 Frequency-Offset Round-Trip System 


A second scheme, shown in Fig. 7.5, is one in which the round-trip phase is
measured directly. The signals traveling in opposite directions are at frequencies ν₁
and ν₂ that differ by only a small amount, but enough to enable them to be separated
easily. This type of system is widely used, and we examine its performance in some 
detail. Note that although directional couplers or circulators allow the signals at the 
same frequency but going in opposite directions in the line to be separated, the signal 
from the unwanted direction is suppressed by only 20-30 dB relative to the wanted 
one. An unwanted component at a level of —30 dB can cause a phase error of 1.8°. 
However, the frequency offset enables the signals to be separated with much higher 
isolation. 
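
The 1.8° figure follows from adding a small phasor of relative amplitude 10^(−30/20) to the wanted signal; the largest phase deflection occurs when the unwanted component is in quadrature with the resultant. A minimal sketch (the function name is illustrative):

```python
import math


def max_phase_error_deg(level_db):
    """Largest phase deflection caused by an unwanted component at level_db (negative)."""
    amplitude_ratio = 10 ** (level_db / 20)     # voltage ratio from a power ratio in dB
    return math.degrees(math.asin(amplitude_ratio))


error_30 = max_phase_error_deg(-30.0)   # about 1.8 degrees
error_20 = max_phase_error_deg(-20.0)   # about 5.7 degrees
```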

An oscillator at frequency ν₂ at an antenna is phase-locked to the difference
frequency of signals at ν₁ and ν₁ − ν₂, which travel to the antenna via a transmission
line. The difference frequency ν₁ − ν₂ is small compared with ν₁ and ν₂. The
frequency ν₂ is returned to the master oscillator location for the round-trip phase
comparison.

Fig. 7.5 Phase-lock scheme for the oscillator ν₂ at the antenna. Frequencies ν₁ and ν₁ − ν₂ are
transmitted to the antenna station where they provide the phase reference to lock the oscillator. ν₁
and ν₂ are almost equal, so ν₁ − ν₂ is small. A signal at frequency ν₂ is returned to the central
station for the round-trip phase measurement.

At the antenna, the phases of the signals at frequencies ν₁ and ν₁ − ν₂ relative to
their phases at the central location are 2πν₁L/v and 2π(ν₁ − ν₂)L/v, where L is the
length of the cable. The phase of the ν₂ oscillator at the antenna is constrained by
a phase-locked loop to equal the difference of these phases, that is, 2πν₂L/v. The
phase change in the ν₂ signal in traveling back to the central location is 2πν₂L/v,
and thus the measured round-trip phase (modulo 2π) is 4πν₂L/v. Now suppose
that the length of the line changes by a small fraction, β. The phase of the oscillator
ν₂ at the antenna relative to the master oscillator changes to 2πν₂L(1 + β)/v. The
required correction to the ν₂ oscillator is just half the change in the measured
round-trip phase. The problem that arises is that several effects, including reflections and
velocity dispersion in the transmission line, can cause errors in the round-trip phase.
Such errors result in phase offsets of the oscillator at the antenna, which are not
serious if the offsets remain constant. However, in practice, they are likely to vary with
ambient temperature. The largest error usually results from reflections, and control
of this error places an upper limit on the difference frequency ν₁ − ν₂. We now
examine this limit.




Consider what happens if reflections occur at points A and B separated by a
distance ℓ along the line as in Fig. 7.5. The complex voltage reflection coefficients
at these points are ρ_A and ρ_B, and their values will be assumed to be the same
at frequencies ν₁ and ν₂. Signals ν₁ and ν₂, after traversing the cable, include
components that have been reflected once at A and once at B. The coefficients ρ_A and
ρ_B are sufficiently small that components suffering more than one reflection at each
point can be neglected. For the frequency ν₁ arriving at the antenna, the amplitude
(voltage) of the reflected component relative to the unreflected one is

\[ \Delta = |\rho_A||\rho_B|\,10^{-\ell\alpha/10} , \tag{7.11} \]

where α is the (power) attenuation coefficient of the cable in decibels per unit length.
Note that the attenuation in voltage is equal to the square root of the attenuation
in power. The phase of the reflected component relative to the unreflected one is
(modulo 2π)

\[ \theta_1 = 4\pi\ell\nu_1 v^{-1} + \phi_A + \phi_B , \tag{7.12} \]

where φ_A and φ_B are the phase angles of ρ_A and ρ_B (that is, ρ_A = |ρ_A|e^{jφ_A}, etc.), and
v is the phase velocity in the line. Figure 7.6 shows a phasor representation of the
reflected and unreflected components and their phase θ₁. The reflected component
causes the resultant phase to be deflected through an angle φ₁ given by


\[ \phi_1 \simeq \tan\phi_1 = \frac{\Delta\sin\theta_1}{1 + \Delta\cos\theta_1} . \tag{7.13} \]

Fig. 7.6 Phasor diagram of components at frequency ν₁ transmitted by the cable. [Diagram: the
unreflected component (unit amplitude), the reflected component (amplitude Δ), and their resultant.]

Similarly, the phase of the frequency ν₂ is deflected through an angle φ₂, given by
equations equivalent to Eqs. (7.12) and (7.13) with subscript 1 replaced by 2.

With the reflection effects represented by φ₁ and φ₂, the round-trip phase for a
line of length L is

\[ 4\pi\nu_2 L v^{-1} + \phi_1 + \phi_2 . \tag{7.14} \]

If the line length increases uniformly to L(1 + β), the angles φ₁ and φ₂ vary in
a nonlinear manner with ℓ and become φ₁ + δφ₁ and φ₂ + δφ₂, respectively. The
round-trip phase then becomes

\[ 4\pi\nu_2 L v^{-1}(1 + \beta) + \phi_1 + \delta\phi_1 + \phi_2 + \delta\phi_2 . \tag{7.15} \]


(The effect of the reflection on the phase of the signal at frequency ν₁ − ν₂ has been
omitted since ν₁ − ν₂ is much smaller than ν₁ or ν₂, and reflections for the relatively
low frequency may be very small. Also, the rate of change of phase of ν₁ − ν₂ with
line length is correspondingly small.) The applied correction for the increase in line
length is half the measured change in round-trip phase:

\[ 2\pi\nu_2\beta L v^{-1} + \tfrac{1}{2}(\delta\phi_1 + \delta\phi_2) . \tag{7.16} \]

However, the exact correction would be equal to the change in the phase of ν₂ at the
antenna, which is

\[ 2\pi\nu_2\beta L v^{-1} + \delta\phi_2 . \tag{7.17} \]

Consequently, the phase correction is in error by

\[ \tfrac{1}{2}(\delta\phi_1 + \delta\phi_2) - \delta\phi_2 = \tfrac{1}{2}(\delta\phi_1 - \delta\phi_2) . \tag{7.18} \]


If ν₁ and ν₂ were equal, the phase error would be zero. It is possible therefore
to specify a maximum allowable difference frequency in terms of the maximum
tolerable error.

The difference between the phase angles φ₁ and φ₂ is obtained from Eq. (7.13)
as follows:

\[ \phi_1 - \phi_2 \simeq \frac{\partial\phi_1}{\partial\nu_1}(\nu_1 - \nu_2) = \frac{4\pi\ell v^{-1}\Delta\cos\theta_1\,(1 + \Delta\cos\theta_1) + 4\pi\ell v^{-1}\Delta^2\sin^2\theta_1}{(1 + \Delta\cos\theta_1)^2}\,(\nu_1 - \nu_2) . \tag{7.19} \]

The reflected amplitude Δ must be much less than unity if phase errors are to be
tolerable, so terms in Δ² can be omitted from the numerator in Eq. (7.19), and the
denominator is approximately unity. Thus,

\[ \phi_1 - \phi_2 \simeq 4\pi\ell v^{-1}\Delta\,(\nu_1 - \nu_2)\cos\theta_1 . \tag{7.20} \]


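The variation of φ₁ − φ₂ with line length is given by

\[ \delta\phi_1 - \delta\phi_2 = \frac{\partial}{\partial\ell}(\phi_1 - \phi_2)\,\beta\ell = 4\pi v^{-1}\Delta\left[\cos\theta_1 - 0.1\,\ell\alpha(\ln 10)\cos\theta_1 - 4\pi v^{-1}\ell\nu_1\sin\theta_1\right](\nu_1 - \nu_2)\,\beta\ell . \tag{7.21} \]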


The maximum values of the terms in square brackets in Eq. (7.21) are dominated 
by the third term, which is of the order of the number of wavelengths in the line. If 
the two smaller terms are neglected, we obtain the magnitude of the phase error as 
follows: 


\[ \tfrac{1}{2}(\delta\phi_1 - \delta\phi_2) \simeq 8\pi^2 v^{-2}|\rho_A||\rho_B|\,\beta\ell^2\,10^{-\ell\alpha/10}\,\nu_1(\nu_1 - \nu_2)\sin\theta_1 . \tag{7.22} \]
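
The approximation in Eq. (7.22) can be compared with a direct evaluation of Eqs. (7.11)–(7.13) and (7.18). The sketch below is a numerical check under stated assumptions: φ_A = φ_B = 0, a reflection-point spacing of 100 m, and a frequency offset of only 10 kHz, chosen small so that the first-order expansion in ν₁ − ν₂ used in Eq. (7.19) holds well. The remaining values follow the example used later in this section.

```python
import math

V = 2.4e8          # phase velocity, m/s
ALPHA = 0.06       # attenuation, dB/m
RHO = 0.1          # |rho_A| = |rho_B| (assumed equal)
NU1 = 2.0e9        # Hz
NU2 = NU1 - 1.0e4  # Hz; small offset (assumption, see lead-in)
BETA = 1.0e-6      # fractional change in line length
ELL = 100.0        # spacing of the reflection points, m (assumed)


def deflection(nu, ell):
    """Phase deflection from Eqs. (7.11)-(7.13), with phi_A = phi_B = 0."""
    delta = RHO * RHO * 10 ** (-ell * ALPHA / 10)
    theta = 4 * math.pi * ell * nu / V
    return math.atan2(delta * math.sin(theta), 1 + delta * math.cos(theta))


def exact_error():
    """Eq. (7.18), evaluated by stretching the line by the fraction BETA."""
    stretched = ELL * (1 + BETA)
    dphi1 = deflection(NU1, stretched) - deflection(NU1, ELL)
    dphi2 = deflection(NU2, stretched) - deflection(NU2, ELL)
    return 0.5 * (dphi1 - dphi2)


def approx_error():
    """Eq. (7.22), which gives the magnitude of the error."""
    theta1 = 4 * math.pi * ELL * NU1 / V
    return (8 * math.pi ** 2 / V ** 2 * RHO ** 2 * BETA * ELL ** 2
            * 10 ** (-ELL * ALPHA / 10) * NU1 * (NU1 - NU2) * math.sin(theta1))


ratio = abs(exact_error()) / abs(approx_error())   # close to unity
```

For larger frequency offsets, the angle θ₁ − θ₂ is no longer small and Eq. (7.22) should be regarded as an order-of-magnitude estimate for any single pair of reflection points.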


The factor ℓ²10^{−ℓα/10} has a maximum value at

\[ \ell = 20\,(\alpha\ln 10)^{-1} . \tag{7.23} \]

This maximum occurs because for small values of ℓ, the change in the angle θ₁
with frequency or cable expansion is small, and for large values of ℓ, the reflected
component is greatly attenuated. The maximum value is equal to

\[ \left[\ell^2\,10^{-\ell\alpha/10}\right]_{\max} = 10.2\,\alpha^{-2} . \tag{7.24} \]

Curves of ℓ²10^{−ℓα/10} are plotted in Fig. 7.7 for various values of α that
correspond to good-quality cables. It is evident that reducing the attenuation in a
cable increases the error in the round-trip phase correction in Eq. (7.22).
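
Equations (7.23) and (7.24) are easily confirmed numerically. In this sketch the attenuation of 0.06 dB m⁻¹ is taken from the example later in this section, and the function names are illustrative.

```python
import math


def reflection_factor(ell, alpha):
    """The factor ell**2 * 10**(-ell*alpha/10) appearing in Eq. (7.22)."""
    return ell ** 2 * 10 ** (-ell * alpha / 10)


def ell_at_maximum(alpha):
    """Eq. (7.23): spacing at which the factor is greatest."""
    return 20.0 / (alpha * math.log(10))


ALPHA = 0.06                                    # dB per meter
ell_max = ell_at_maximum(ALPHA)                 # about 145 m
peak = reflection_factor(ell_max, ALPHA)        # about 10.2 / ALPHA**2
normalized_peak = peak * ALPHA ** 2             # about 10.2, independent of ALPHA
```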

The type of reflections that may be encountered depends on the type of transmis- 
sion line and how it is used. For example, consider a buried coaxial cable that runs 
along a set of stations used for a movable antenna. The principal cause of reflections 
in such a cable is the connectors that are inserted at the antenna stations. Unless 
the antenna is at the closest station, there are one or more interconnecting loops, 
where unused stations are bypassed, between the antenna and the master oscillator. 
If there are n connectors in the cable, there are N = n(n−1)/2 pairs between which
reflections can occur. Also, if the phasors of the corresponding reflected components
combine randomly, the overall rms error in the phase correction is, from Eq. (7.22),

\[ \delta\phi_{\rm rms} = \sqrt{32}\,\pi^2 v^{-2}|\rho|^2\beta\,\nu_1(\nu_1 - \nu_2)\,F(\alpha, \ell) , \tag{7.25} \]

where

\[ F(\alpha, \ell) = \sqrt{\sum_{i=1}^{n}\sum_{k<i} \ell_{ik}^4\,10^{-2\ell_{ik}\alpha/10}} , \tag{7.26} \]

ℓ_ik is the distance between connectors i and k, the rms value has been used for
sin θ₁, and the reflection coefficients are all approximated by an average magnitude |ρ|.

Fig. 7.7 The function ℓ²10^{−ℓα/10} plotted against ℓ for four values of the transmission-line
attenuation, α dB m⁻¹. This function is a factor in the round-trip phase error given by Eq. (7.22).

As an example, suppose that an interferometer is designed for observations near
100 GHz and that it incorporates ten antenna stations in a linear configuration at
approximately equal increments in distance up to 1 km from the master oscillator.
The interconnecting oscillator cable carries a reference signal at ν₁ = 2 GHz, and
for this cable |ρ| = 0.1, α = 0.06 dB m⁻¹, v = 2.4 × 10⁸ m s⁻¹, and the temperature
coefficient of electrical length is 10⁻⁵ K⁻¹. From Eq. (7.26), we find that F(α, ℓ) =
1.1 × 10⁴. For a temperature variation of 0.1 K in the cable, β = 10⁻⁶. If phase
errors at 100 GHz are required to be less than 1°, δφ_rms must not exceed 0.02°, and
from Eq. (7.25), ν₁ and ν₂ must not differ by more than 1.6 MHz.
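
The numbers in this example can be reproduced with a short calculation. The sketch assumes one connector at each of the ten stations, spaced at exactly 100 m, with lengths in meters; these details, and the function names, are assumptions made for illustration.

```python
import math


def f_factor(connector_positions_m, alpha_db_per_m):
    """Eq. (7.26): root-sum-square over all pairs of connectors."""
    total = 0.0
    for i in range(len(connector_positions_m)):
        for k in range(i):
            ell = abs(connector_positions_m[i] - connector_positions_m[k])
            total += ell ** 4 * 10 ** (-2 * ell * alpha_db_per_m / 10)
    return math.sqrt(total)


def max_frequency_offset_hz(dphi_rms_rad, v, rho, beta, nu1, f_value):
    """Eq. (7.25) solved for the difference frequency nu1 - nu2."""
    return dphi_rms_rad * v ** 2 / (
        math.sqrt(32) * math.pi ** 2 * rho ** 2 * beta * nu1 * f_value)


stations = [100.0 * m for m in range(1, 11)]        # 100 m, 200 m, ..., 1000 m (assumed)
f_value = f_factor(stations, 0.06)                  # about 1.1e4 (units of m^2)
offset = max_frequency_offset_hz(math.radians(0.02), 2.4e8, 0.1, 1.0e-6,
                                 2.0e9, f_value)    # about 1.6e6 Hz
```

The sum is dominated by the closely spaced pairs, since the contribution of pairs separated by more than a few hundred meters is strongly attenuated.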


7.2.4 Automatic Correction System


An interesting variation on the round-trip scheme, shown in Fig. 7.8, was suggested 
by J. Granlund (National Radio Astronomy Observatory 1967). It is particularly 
suitable for providing a stable reference frequency at a number of points along a 
linear array of antennas. Frequencies ν₁ and ν₂ are generated by stable oscillators
and are injected at opposite ends of the transmission line. The difference frequency
ν₁ − ν₂ is again very small. At an intermediate station, the two signals are extracted
by directional couplers and multiplied to form the sum frequency. The phase of this
sum at the antenna station in Fig. 7.8 is

\[ 2\pi\nu_1\ell_1 v^{-1} + 2\pi\nu_2(L - \ell_1)v^{-1} = 2\pi\nu_1 L v^{-1} - 2\pi(\nu_1 - \nu_2)(L - \ell_1)v^{-1} . \tag{7.27} \]

For two points at positions ℓ₁ and ℓ₂ on the line, the difference in the
sum-frequency phases is

\[ \Delta\phi = 2\pi(\nu_1 - \nu_2)(\ell_1 - \ell_2)v^{-1} . \tag{7.28} \]


This difference would be zero if $\nu_1$ and $\nu_2$ were equal, but it is necessary to
maintain a finite difference frequency because the directivity of the couplers alone
is seldom sufficient to separate the two signals adequately. The effect of the line
length variation is not measured explicitly in this case, but the correction occurs
automatically, except for the small term in Eq. (7.28). Reflections in the cable can
produce errors, as described for the previous scheme, and may be the limiting
consideration for the frequency offset. A practical implementation of the scheme
of Fig. 7.8 is described by Little (1969).

Fig. 7.8 Scheme proposed by J. Granlund (National Radio Astronomy Observatory 1967) for
establishing a reference signal at frequency $\nu_1 + \nu_2$ at various stations along a transmission line.
One such antenna station is shown.
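As a rough numerical illustration of Eq. (7.28), the sketch below evaluates the residual phase difference of the sum frequency between two stations. The function name, the velocity factor of 0.7, the 1-kHz offset, and the 1-km separation are all assumed values chosen for illustration; none of them is taken from the systems cited above.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def sum_frequency_phase_difference(nu1_hz, nu2_hz, l1_m, l2_m, v_m_per_s):
    """Eq. (7.28): phase difference (radians) of the sum frequency
    between stations at positions l1 and l2 along the line."""
    return 2.0 * math.pi * (nu1_hz - nu2_hz) * (l1_m - l2_m) / v_m_per_s


# Assumed example: two oscillators near 1 GHz offset by 1 kHz,
# stations 1 km apart, line velocity 0.7c (hypothetical cable).
v_line = 0.7 * C_M_PER_S
dphi = sum_frequency_phase_difference(1.0e9 + 500.0, 1.0e9 - 500.0,
                                      1000.0, 0.0, v_line)
print(f"residual phase difference: {math.degrees(dphi):.2f} deg")

# With equal frequencies the residual term vanishes.
dphi_equal = sum_frequency_phase_difference(1.0e9, 1.0e9, 1000.0, 0.0, v_line)
```

The residual scales linearly with both the frequency offset and the station separation, so the offset chosen to separate the signals directly sets the size of the uncorrected term.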


7.2.5 Fiberoptic Transmission of LO Signals 


Optical fiber can replace cables and transmission lines in most of the LO schemes 
discussed above. Some features of optical fiber transmission that should be taken 
into account are outlined below. 


• Different optical wavelengths can be used in the two directions of a round-trip
  system to help separate the signals. At the antenna, the frequency of the laser
  signal from the master LO can be offset by a few tens of megahertz by using
  a special modulating device, and injected into the line in the return direction.
  Alternatively, a different laser can be used for the return signal. It is important
  to take into account the effects of the fiber dispersion and temperature-induced
  changes in the laser wavelengths, particularly in the case in which two different
  lasers are used. However, if the laser wavelengths are chosen to be very close to
  the zero-dispersion wavelength of the fiber, the resulting errors can be minimized.

• As mentioned in Sect. 7.1, the performance of optical components such as isolators
  and directional couplers is much better than that of corresponding microwave
  components. With careful design, it is possible to use such components to
  separate signals at the same laser wavelength traveling in opposite directions
  in a fiber. Round-trip phase systems have been made in which a radio frequency
  signal is transmitted on an optical carrier, and at the receiving end, a half-silvered
  mirror is used to return a component of the signal back along the fiber for a
  round-trip measurement. It may be necessary to use an optical isolator at the
  transmitting end to ensure that any of the returned signal that reaches the laser
  is very small. Reflection of a laser signal back into the output can disturb the
  operation of the laser.

• In general, when a multifiber cable is flexed, the effective lengths of the
  individual fibers vary smoothly and remain matched to a much greater degree
  than is the case for bundled coaxial cables. As a result, it may be possible to
  use two separate fibers for the two different directions in a round-trip scheme,
  depending on the accuracy required.

• Twisting of a straight fiber that is held under constant tension has been found
  to cause less change in the electrical length than bending of a fiber. Twisting,
  however, can result in small changes in the amplitude of the transmitted signal,
  resulting from the residual sensitivity of the optical receiver to the angle of the
  linear polarization of the light.

• It is possible to stabilize the length of the path through a fiber by use of round-
  trip phase measurement at the optical wavelength. In practice, this requires the
  use of an automatic correction loop in which a length adjustment device is
  controlled by the round-trip phase, since length variations comparable to the
  optical wavelength can occur on timescales of much less than one second.

• An LO frequency can be transmitted as the difference frequency of two optical
  laser signals that travel in the same fiber. The radio frequency is generated by
  combining the optical signals in a photo-optic diode. Radio power of several
  microwatts can be obtained, which is sufficient to provide LO power for an
  SIS mixer. This scheme is particularly attractive for receivers at millimeter and
  submillimeter wavelengths (Payne et al. 1998).

• For standard optical fiber, the temperature coefficient of length is approximately
  $7 \times 10^{-6}$ K$^{-1}$. High-stability fiber, developed by Sumitomo for special
  applications, has a temperature coefficient that is about an order of magnitude less
  and was used in the Submillimeter Array without a round-trip correction system
  (Moran 1998).
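To give a feel for the size of the temperature effect in the last item, the following sketch converts a temperature coefficient of length into the phase change of an LO signal carried on the fiber. It assumes that the quoted coefficient can be applied directly to the electrical length, and it uses a group index of 1.47, a 1-km fiber run, a 1-K temperature change, and a 1-GHz LO frequency; these values and the function name are illustrative assumptions, not parameters of any instrument described here.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def lo_phase_change_deg(length_m, temp_coeff_per_k, delta_t_k,
                        lo_freq_hz, group_index=1.47):
    """Phase change (degrees) of an LO signal of frequency lo_freq_hz
    when a fiber of the given length changes temperature by delta_t_k.

    Assumes the coefficient describes the fractional change in
    electrical length and that group_index is constant (hypothetical)."""
    delta_length_m = temp_coeff_per_k * length_m * delta_t_k
    delta_delay_s = group_index * delta_length_m / C_M_PER_S
    return 360.0 * lo_freq_hz * delta_delay_s


# Standard fiber (coefficient from the text) versus a fiber that is
# an order of magnitude more stable.
standard_deg = lo_phase_change_deg(1000.0, 7e-6, 1.0, 1.0e9)
stable_deg = lo_phase_change_deg(1000.0, 7e-7, 1.0, 1.0e9)
print(f"standard fiber: {standard_deg:.2f} deg per K per km at 1 GHz")
print(f"high-stability fiber: {stable_deg:.2f} deg per K per km at 1 GHz")
```

Because the phase change is proportional to the LO frequency, the same fiber run produces a proportionally larger error once the reference is multiplied up to the first LO frequency.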


7.2.6 Phase-Locked Loops and Reference Frequencies 


Some practical points in the implementation of LO systems should be briefly 
mentioned. In two of the schemes described above, an oscillator at the antenna 
is controlled by a phase-locked loop. Details of the design of phase-locked loops 
are given, for example, by Gardner (1979), and here we mention only the choice 
of the natural frequency of the loop. Unless the natural frequency is about an 
order of magnitude less than the frequency at the inputs of the phase detector, the 
loop response may be fast enough to introduce undesirable phase modulation at 
the phase detector frequency. In the system in Fig. 7.5, the frequency of the input
signals to the phase detector is the offset frequency $\nu_1 - \nu_2$, an upper limit on
which has been placed by consideration of the reflections in the line. Also, the
bandwidth of the noise to which the loop responds is proportional to the natural 
frequency. These considerations place an upper limit on the natural frequency of 
the loop, which in turn limits the choice of the oscillator to be locked. An oscillator 
with inherently poor phase stability (when unlocked) requires a loop with a higher 
natural frequency than does a more stable oscillator. Crystal-controlled oscillators 
are highly stable and require loop natural frequencies of only a few hertz. They are 
especially suitable for long transmission lines because the noise bandwidth of the 
loop is correspondingly small. With crystal-controlled oscillators at the antennas, it 
is possible to send out the reference frequency in bursts, rather than continuously. 
Signals traveling in opposite directions can then be separated by time multiplexing, 
and no frequency offset is required. However, the change in impedance of the 
circuits at the ends of the cable when the direction of the signal is reversed could 
become a limiting factor in the accuracy of the round-trip phase measurement. 
Systems of this type have been designed for several large arrays (Thompson et al. 
1980; Davies et al. 1980). 

Fig. 7.9 Scheme for generating a comb spectrum of harmonics of a frequency $\nu$, in which phase
changes in the harmonic generator are eliminated by enclosing it within a phase-locked loop. The
filter passes two harmonics that combine in the mixer diode to generate a signal at frequency $\nu$.

In addition to the establishment of a phase-locked oscillator at each antenna
at a reference frequency (equal to $\nu$ in Fig. 7.4, $\nu_2$ in Fig. 7.5, and $\nu_1 + \nu_2$ in
Fig. 7.8), it is necessary to generate the multiples or submultiples of this frequency
that are required for frequency conversions of the received signal. In frequency 
multiplication, phase variations increase in proportion to the frequency. Within the 
multiplier chain from the frequency standard to the first LO frequency, the choice of 
frequency that is transmitted from the central location to the antenna is generally not 
critical. However, if significant noise is added in the transmission process, it may 
be better to transmit a high frequency to minimize multiplication of phase errors 
resulting from the added noise. 

Minimization of phase variations in the frequency-multiplication circuit is 
largely a matter of reducing temperature-related effects, and in this regard, the 
scheme depicted in Fig. 7.9 is worthy of mention. It may be useful to generate a
“comb” spectrum consisting of many harmonics that can be used, for example, for 
tuning in discrete frequency intervals. This can be done by applying the fundamental 
frequency to a varactor diode, but the voltage at which the varactor goes into 
conduction varies with temperature, so the phase of the waveform at which it starts 
to conduct during each cycle varies. This causes variation in the phases of the 
harmonics that are generated. In the circuit in Fig. 7.9, the effect of this variation is 
eliminated. The input fundamental waveform at frequency $\nu$ is not applied directly
to the harmonic generator but is used to lock an oscillator at frequency $\nu$. This
oscillator drives the harmonic generator. The waveform at the oscillator frequency
that is compared with the input frequency is taken after the varactor by selecting two
adjacent harmonics and combining them in a mixer diode. The phase-locked loop
holds constant the phase of this output waveform relative to the input frequency $\nu$
and adjusts the phase of the oscillator to compensate for a change in the switch-on
time of the varactor.

In the case of a connected-element array, low-frequency components of the phase 
noise of the master oscillator cause similar effects in the LO phase at each antenna, 
and therefore their contributions to the relative phase of the signals at the correlator 
input tend to cancel. However, the frequency components of the phase noise suffer 
phase changes as a result of the time delay in the path of the reference signal from 
the master oscillator to each antenna, and also as a result of the time delay of 
the IF signal from the corresponding mixer to the correlator input (including the 
variable delay that compensates for the geometric delay). Thus, the cancellation is
important only for frequency components of the phase noise that are low enough
that differences in these phase changes, from one antenna to another, are small. The 
bandwidths of phase-locked loops in the LO signals can also limit the frequency 
range over which phase noise in the master oscillator is canceled. In practice, 
cancellation of phase noise from the master oscillator is likely to be effective up to a 
frequency in the range of some tens of hertz to a few hundred kilohertz, depending 
upon the parameters of the particular system. 


7.2.7 Phase Stability of Filters 


Tuned filters used for selecting LO frequencies are also a source of temperature-related
phase variations. The phase response $\phi$ of a filter changes by approximately
$n\pi/2$ across the 3-dB bandwidth $\Delta\nu$, where $n$ is the number of sections (poles).
Thus, the rate of change of phase with frequency, measured at the center frequency
$\nu_0$, is


$$\left.\frac{d\phi}{d\nu}\right|_{\nu_0} \simeq \frac{n\pi k_1}{2\Delta\nu} , \tag{7.29}$$


where $k_1$ is a constant of order unity that depends on the design of the filter. The
center frequency varies with physical temperature $T$ by


$$\frac{d\nu_0}{dT} = k_2 \nu_0 , \tag{7.30}$$


where $k_2$ is a constant related to the coefficients of expansion and variation of the
dielectric constant of the filter. Thus, the rate of variation of phase with temperature
is given by


$$\frac{d\phi}{dT} = \left.\frac{d\phi}{d\nu}\right|_{\nu_0} \frac{d\nu_0}{dT} \simeq \frac{n\pi}{2}\, k_1 k_2 \left(\frac{\nu_0}{\Delta\nu}\right) . \tag{7.31}$$


The factor $\nu_0/\Delta\nu$ is the Q-factor of the filter. The combined constant $k_1 k_2$ can be
determined empirically and is typically of order $10^{-5}$ K$^{-1}$ for tubular bandpass
filters with center frequencies in the range 1 MHz to 1 GHz. Thus, for example,
if one allows a 1-K temperature variation for such a filter and places an upper limit
of 0.1° on its contribution to the phase variation, the fractional bandwidth must
not be less than approximately $n/110$, or 5.4% for a six-pole filter. Filters of narrow fractional
bandwidth should be used with caution. To pick out a particular frequency from 
a series of closely spaced harmonics, it may be preferable to use a phase-locked 
oscillator rather than a filter. 
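The example in this paragraph can be checked by inverting Eq. (7.31) for the fractional bandwidth. The sketch below is only a worked calculation under the stated assumptions ($k_1 k_2 = 10^{-5}$ K$^{-1}$, a 1-K temperature change, and a 0.1° phase limit); the function name is hypothetical.

```python
import math


def min_fractional_bandwidth(n_poles, k1k2_per_k, delta_t_k, max_phase_deg):
    """Smallest fractional bandwidth (delta_nu / nu_0) for which the
    temperature-induced phase change of Eq. (7.31) stays within
    max_phase_deg for a temperature change of delta_t_k."""
    max_phase_rad = math.radians(max_phase_deg)
    return (n_poles * math.pi / 2.0) * k1k2_per_k * delta_t_k / max_phase_rad


# Six-pole filter, k1*k2 = 1e-5 per K, 1-K variation, 0.1-degree limit.
frac_bw = min_fractional_bandwidth(6, 1.0e-5, 1.0, 0.1)
print(f"minimum fractional bandwidth: {100.0 * frac_bw:.1f}%")
```

With these values the limit is 0.9% per pole, so each additional pole widens the minimum acceptable bandwidth by the same amount.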


7.2.8 Effect of Phase Errors


Rapidly varying phase errors, such as those resulting from noise in LO circuits, 
cause a loss in signal amplitude and hence in sensitivity. They may also cause errors 
in the visibility phase, but the effect is small, since fast variations in the visibility 
phase are substantially reduced by the visibility averaging. To determine the loss in 
sensitivity, the signals from two antennas can be represented by $V_m e^{j\phi_m}$ and $V_n e^{j\phi_n}$
at the correlator inputs, where the $\phi$ terms are the phase errors for antennas $m$ and
$n$. The correlator output is


$$r = \left\langle V_m e^{j\phi_m(t)}\, V_n^* e^{-j\phi_n(t)} \right\rangle , \tag{7.32}$$


where the angle brackets represent the expectation. Then if $\Delta\phi = [\phi_m(t) - \phi_n(t)]$ is
the phase error, we have


$$r = V_m V_n^* \left[ \langle\cos\Delta\phi\rangle + j\langle\sin\Delta\phi\rangle \right] . \tag{7.33}$$


If the probability distribution of $\Delta\phi$ is an even function with zero mean, which is
frequently the case, the time average of the sine term has an expectation of zero.
Then, by using the first two terms of the series expression for a cosine, we obtain a
result in terms of the rms phase error, $\Delta\phi_{\rm rms}$:


$$r \simeq V_m V_n^* \left( 1 - \frac{1}{2} \Delta\phi_{\rm rms}^2 \right) . \tag{7.34}$$

The cosine approximation is accurate to 1% for values of $\Delta\phi_{\rm rms}$ less than ~37°. A
reduction in sensitivity of 1% occurs for $\Delta\phi_{\rm rms}$ = 8.1°.
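The figures quoted after Eq. (7.34) are easy to reproduce. The sketch below evaluates the small-angle loss $\frac{1}{2}\Delta\phi_{\rm rms}^2$ and, for comparison, the loss for Gaussian-distributed phase errors, for which $\langle\cos\Delta\phi\rangle = \exp(-\Delta\phi_{\rm rms}^2/2)$. The Gaussian assumption and the function names are introduced here for illustration only.

```python
import math


def loss_small_angle(rms_deg):
    """Fractional sensitivity loss from Eq. (7.34)."""
    return 0.5 * math.radians(rms_deg) ** 2


def loss_gaussian(rms_deg):
    """Fractional loss if the phase error is Gaussian (an assumption):
    <cos(dphi)> = exp(-rms^2 / 2)."""
    return 1.0 - math.exp(-0.5 * math.radians(rms_deg) ** 2)


def rms_for_loss_deg(loss):
    """rms phase error (degrees) giving the stated loss in Eq. (7.34)."""
    return math.degrees(math.sqrt(2.0 * loss))


rms_1pct = rms_for_loss_deg(0.01)
print(f"rms phase error for 1% loss: {rms_1pct:.1f} deg")
for rms in (5.0, 8.1, 12.8, 20.0):
    print(f"{rms:5.1f} deg: small-angle {100 * loss_small_angle(rms):.2f}%, "
          f"Gaussian {100 * loss_gaussian(rms):.2f}%")
```

For rms errors of a few degrees the two estimates are nearly identical, which is why the two-term cosine expansion is adequate for LO noise budgets of this size.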


7.3 Frequency Responses of the Signal Channels 


7.3.1 Optimum Response 


The signals in a synthesis array may pass through amplifiers, filters, and mixers 
before being converted to digital form. The characteristics of these components 
can vary with temperature, etc. However, the resulting problems have become less 
serious as improvements in the technology allow digitization to take place at earlier 
stages in the receiving system. Also, for systems with multichannel correlators, 
as used for spectral line observations, gains of the individual channels can be 
adjusted to provide a uniform response across the full receiving band. However, 
it is important to consider the effect of gain variations in analog components since 
low-noise input stages are generally followed by further amplification to increase 
the signal levels to something of order 20 dBm or more before they are digitized.

Except in cases in which the astronomical signals cover a wide relative bandwidth,
the signal and the receiver noise both have largely flat spectra over the
width of an IF band, and the broad spectral limits of the signal delivered to the
digitizer, or, in earlier systems, to the analog correlator, are determined by the
frequency response of the receiving equipment. If $H(\nu) = |H(\nu)| e^{j\phi(\nu)}$ is the
voltage-frequency response function, the output from the correlator for antennas
$m$ and $n$, resulting from cosmic signals, is proportional to


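$$\begin{aligned}
\int_{-\infty}^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu &= 2\,{\rm Re}\left[ \int_0^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu \right] \\
&= 2\,{\rm Re}\left[ \int_0^{\infty} |H_m(\nu)|\,|H_n(\nu)|\, e^{j(\phi_m - \phi_n)}\, d\nu \right] ,
\end{aligned} \tag{7.35}$$


where we have used the relation in Eq. (A3.6) of Appendix 3.1, $H_m H_n^*$ being
Hermitian, and the subscripts denote the antennas. We are concerned here with the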
dependence of the signal-to-noise ratio (SNR) of an observation on the frequency
responses of the signal channels. In practice, the frequency responses are nonzero
only within a limited frequency band of width $\Delta\nu$. From Eq. (6.42), we can define
a factor $F$ equal to the SNR relative to that with identical rectangular responses of
width $\Delta\nu$:


$$F = \frac{{\rm Re}\left[ \int_0^{\infty} H_m(\nu) H_n^*(\nu)\, d\nu \right]}{\sqrt{\Delta\nu \int_0^{\infty} |H_m(\nu)|^2\, |H_n(\nu)|^2\, d\nu}} . \tag{7.36}$$


This equation has a maximum value if $|H_m(\nu)|$ and $|H_n(\nu)|$ are constant across the
band $\Delta\nu$, that is, if the amplitude response is a rectangular function. If, in addition,
$\phi(\nu)$ is identical for both antennas, $F$ is equal to unity. Thus, a rectangular passband
yields the greatest sensitivity within a limited bandwidth. Note that the same integral
of $H_m H_n^*$ applies to both the real and imaginary parts of a complex correlator, and
hence it also applies to the modulus of the visibility.
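Equation (7.36) can be evaluated numerically for any pair of model responses. The sketch below uses a simple midpoint-rule integration over a band normalized to unit width and compares a rectangular passband with one that has an amplitude slope of 3.5 dB edge-to-edge. It assumes the same slope in both channels, which may differ from the configuration used elsewhere in this chapter; the function names and the normalization are illustrative choices.

```python
import math


def degradation_factor(h_m, h_n, nu_lo, nu_hi, n_steps=20000):
    """Evaluate Eq. (7.36) by midpoint-rule integration over the band
    [nu_lo, nu_hi], outside of which the responses are taken as zero."""
    d_nu = (nu_hi - nu_lo) / n_steps
    cross = 0.0 + 0.0j
    power = 0.0
    for k in range(n_steps):
        nu = nu_lo + (k + 0.5) * d_nu
        hm = complex(h_m(nu))
        hn = complex(h_n(nu))
        cross += hm * hn.conjugate() * d_nu
        power += abs(hm) ** 2 * abs(hn) ** 2 * d_nu
    return cross.real / math.sqrt((nu_hi - nu_lo) * power)


def sloped_response(edge_to_edge_db, nu_lo, nu_hi):
    """Voltage response whose log amplitude varies linearly with
    frequency by edge_to_edge_db across the band."""
    nu_c = 0.5 * (nu_lo + nu_hi)
    width = nu_hi - nu_lo

    def h(nu):
        return 10.0 ** (edge_to_edge_db / 20.0 * (nu - nu_c) / width)

    return h


flat = sloped_response(0.0, 0.0, 1.0)
sloped = sloped_response(3.5, 0.0, 1.0)
f_flat = degradation_factor(flat, flat, 0.0, 1.0)
f_sloped = degradation_factor(sloped, sloped, 0.0, 1.0)
print(f"F (rectangular) = {f_flat:.4f}")
print(f"F (3.5 dB slope in both channels) = {f_sloped:.4f}")
```

Any callable returning a real or complex voltage response can be passed in, so ripple, center-frequency offsets, or phase slopes can be explored the same way.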

Of the other ways in which the receiving passband modifies the response of 
a synthesis array, the most important is the smearing of detail in the synthesized 
response, which limits the field of view that can usefully be imaged. This effect has 
been described in Sect. 6.3. For a given sensitivity, a rectangular passband results in 
the least smearing, since it is the most compact in the frequency dimension. 

An exact rectangular passband is only an ideal concept. In practice, the steepness 
of the sides of the passband must be determined by the particular design and the 
number of poles in the response. The response can be made to approximate a rect- 
angular shape more closely as the number of poles increases, with a proportionate 
increase in $\partial\phi/\partial T$ as shown by Eq. (7.31). To examine the tolerable deviations of the
actual passband responses, two effects must be considered: the decrease in the SNR,
and the introduction of errors in determining gain factors for individual antennas, as
will be described.


7.3.2 Tolerances on Variation of the Frequency Response: 
Degradation of Sensitivity 


We first consider the effects on the sensitivity. Equation (7.36) provides a degradation
factor $F$, which is the SNR with frequency responses $H_m(\nu)$ and $H_n(\nu)$,
expressed as a fraction of that which would be obtained with rectangular passbands
of width $\Delta\nu$. In constructing a receiving system, the usual goal is to keep the
passband flat with steep edges, but in practice, effects such as differential attenuation 
and reflections in cables introduce slopes and ripples in the frequency response that 
are not identical from one antenna to another. To examine these effects, F can be 
calculated for an initially rectangular passband with various distortions imposed. 
The distortions considered are the following: 


1. Amplitude slope across the passband, with the logarithm of the amplitude varying
   linearly with frequency.

2. Sinusoidal amplitude ripple; this could result from a reflection in a transmission
   line.

3. Displacement of the center frequency of the passband.

4. Variation in phase response as a function of frequency.

5. Delay-setting error, which introduces a component of phase linear with frequency.


The first four of these effects apply mainly to signals in analog form and so are 
of most importance in systems of earlier design in which digitization of the signals 
occurred only in the later stages. Expressions for the frequency response involving 
the above effects are given in the first column of Table 7.1. The second column of the 
table gives the signal-to-noise degradation factor $F$, and subscripts $m$ and $n$ indicate
parameter values for particular antennas. The expressions in Table 7.1 have been 
used to derive the maximum tolerable passband distortion for each of the effects, 
allowing a loss in sensitivity of no more than 2.5% (F = 0.975). The resulting 
limits on the passband distortion are shown in Table 7.2. A discussion of limits with 
more stringent tolerances associated with the ALMA array is given by D’Addario
(2003).


Table 7.2 Examples of frequency response tolerances

                                 Criterion
Type of variation                2.5% degradation in                1% maximum
                                 signal-to-noise ratio              gain error
Amplitude slope                  3.5 dB edge-to-edge                2.7 dB edge-to-edge
Sinusoidal ripple                2.9 dB peak-to-peak                2.0 dB peak-to-peak
Center-frequency displacement    $0.05\,\Delta\nu$                  $0.007\,\Delta\nu$
Phase variation                  $\phi_m - \phi_n$ = 12.8° rms      $\phi_m - \phi_n$ = 9.1° rms
Delay-setting error              $0.12/\Delta\nu$                   $0.05/\Delta\nu$

7.3.3 Tolerances on Variation of the Frequency Response: 
Gain Errors 


A second effect that sets limits on the deviations of the frequency responses results 
from errors that can be introduced in the calibration procedure. If we omit the noise 
terms, the output of the correlator for an antenna pair can be expressed as 


$$r_{mn} = G_{mn} \mathcal{V}_{mn} , \tag{7.37}$$


where $\mathcal{V}_{mn}$ is the source-dependent complex visibility from which the intensity map
can be computed, and $G_{mn}$ is a gain factor related to the frequency responses of the
signal channels. We suppose that these responses incorporate the characteristics of
the antennas and electronics in such a way that $G_{mn}$ is proportional to the correlator
output for antenna pair $(m,n)$ when a point source of unit flux density at the field
center is observed. In practice, the $G_{mn}$ values may be determined from observations
of calibration sources for which the visibilities are known. The measured antenna-
pair gains can be used to correct the correlator output data directly, but there are
advantages if, instead, they are used to determine (voltage) gain factors $g = |g| e^{j\phi}$
for the individual antennas such that


$$G_{mn} = g_m g_n^* . \tag{7.38}$$


Since, in a large array, there are many more correlated antenna pairs than antennas
[up to $n_a(n_a - 1)/2$ pairs for $n_a$ antennas], not all the calibration data need be used.
This adds important flexibility to the calibration procedure; for example, a source 
resolved at the longest spacings of an array can be used to determine the antenna 
gains from measurements made only at the shorter spacings. The same principle 
leads to adaptive calibration described in Sect. 11.3. 

In general, the factoring in Eq. (7.38) requires that the frequency responses be 
identical for all antennas or differ only by constant multiplicative factors. If this 
requirement is fulfilled, we can assign gain factors 


g= f HOPA : (7.39) 
0 

In practice, the frequency responses differ, and an approximate solution to Eq. (7.38)
can be obtained by choosing the $g$ values to minimize


$$\sum_{m<n} \left| G_{mn} - g_m g_n^* \right|^2 , \tag{7.40}$$


where the summation is taken over all antenna pairs $(m,n)$ for which $G_{mn}$ can
be measured by observation of a calibration source. In calibrating subsequent
observations of unknown sources, $g_m g_n^*$ is used in place of $G_{mn}$ in Eq. (7.37) for
all antenna pairs, whether they are directly calibrated or not. To avoid introducing
errors with this scheme, the residuals


$$\varepsilon_{mn} = G_{mn} - g_m g_n^* \tag{7.41}$$


must be small, which requires that the frequency responses be sufficiently similar. 
Thus, we are concerned here with the deviations of the frequency responses from 
one another rather than from an ideal response. 
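One simple way to carry out the minimization of expression (7.40) is to update one antenna gain at a time while holding the others fixed. The sketch below is a minimal illustration of that idea, not the procedure used by any particular array: the function names, the starting guess of unit gains, the number of sweeps, and the five-antenna test values are all assumptions. Note that the solution is determined only up to a common phase factor, so the residuals rather than the gains themselves should be compared.

```python
import cmath


def fit_antenna_gains(pair_gains, n_ant, n_sweeps=200):
    """Find antenna gains g that reduce sum |G_mn - g_m g_n^*|^2.

    pair_gains maps (m, n), with m < n, to the measured complex G_mn.
    With the other gains fixed, g_m = sum_n(G_mn g_n) / sum_n(|g_n|^2)
    is the exact minimizer for antenna m, so the cost never increases."""
    full = {}
    for (m, n), value in pair_gains.items():
        full[(m, n)] = value
        full[(n, m)] = value.conjugate()  # G_nm = G_mn^*
    g = [1.0 + 0.0j] * n_ant  # starting guess (an assumption)
    for _ in range(n_sweeps):
        for m in range(n_ant):
            num = 0.0 + 0.0j
            den = 0.0
            for n in range(n_ant):
                if (m, n) in full:
                    num += full[(m, n)] * g[n]
                    den += abs(g[n]) ** 2
            if den > 0.0:
                g[m] = num / den
    return g


def gain_residuals(pair_gains, g):
    """Residuals of Eq. (7.41) for every measured pair."""
    return {(m, n): G - g[m] * g[n].conjugate()
            for (m, n), G in pair_gains.items()}


# Hypothetical test case: exact pair gains built from known antenna gains.
true_gains = [cmath.rect(1.00, 0.30), cmath.rect(0.90, -0.20),
              cmath.rect(1.10, 0.10), cmath.rect(1.05, -0.40),
              cmath.rect(0.95, 0.25)]
pairs = {(m, n): true_gains[m] * true_gains[n].conjugate()
         for m in range(5) for n in range(m + 1, 5)}
fitted = fit_antenna_gains(pairs, 5)
worst = max(abs(e) for e in gain_residuals(pairs, fitted).values())
print(f"largest residual: {worst:.2e}")
```

Pairs that were not measured, for example long spacings on which the calibrator is resolved, are simply left out of the dictionary, which mirrors the flexibility described in the text.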

By using model responses for groups of antennas—calculating the pair gains, 
the best-fit antenna gains, and the residuals—tolerances on the bandpass distortion 
can be assigned. Pair gains for the various distortions discussed earlier are given in 
the third column of Table 7.1. Table 7.2 shows examples of tolerances. The results 
depend to some extent on the distribution of distortions in the model responses, 
which for the results shown were chosen with the intention of maximizing the 
residuals. The criteria of 2.5% loss in sensitivity and 1% maximum gain error 
shown in Table 7.2 were used during the early operation of the VLA (Thompson
and D’Addario 1982). More stringent criteria may be appropriate, depending on the
sensitivity and dynamic range to be achieved. The acceptable level of gain error for 
any instrument can be determined by making calculations of the response to source 
models with simulated errors of various levels introduced into the model visibility 
data. Bagri and Thompson (1991) give a discussion of the sources and effects of 
gain errors in the VLA. 


7.3.4 Delay and Phase Errors in Single- and Double-Sideband 
Systems 


For an incoming wavefront from a source, the path lengths to different antennas of
an array are generally unequal. The relative time differences in the wavefront arrival
at the antennas are referred to as the geometric delays, $\tau_g$. To compensate for the
different geometric delays, the signal received at each antenna is subjected to an
instrumental delay $\tau_i$ that is continuously adjusted so that $\tau_g + \tau_i$ is the same for all
antennas. Thus, the signals at the correlator inputs are aligned in time with respect
to a common wavefront incident from the phase reference position. The fringes at
the correlator output result from the fact that the signals traverse the geometric and
the instrumental delays at different frequencies and that the phase shifts resulting
from the delays vary as the delays themselves change. In an ideal situation, the
instrumental delays would be continuously adjusted, and if there were no frequency
changes within the receivers, no fringe oscillations would occur. In practice, the
situation is rather more complicated. The instrumental delays are inserted after the
signals have been digitized, and the sample interval $\tau_s$ provides a convenient unit
for coarse adjustment. For Nyquist sampling, $\tau_s = 1/(2\Delta\nu)$, where $\Delta\nu$ is the signal
bandwidth.


7.3.5 Delay Errors and Tolerances 


In an array with a number of antennas, the delays are adjusted relative to the delay 
for a designated reference antenna. We consider the reference antenna to be the one 
that is the last one encountered by the approaching wavefront, and its instrumental 
delay remains fixed. The delay error for any antenna is the difference between the 
sum of the geometric and instrumental delays for that antenna and for the reference 
antenna. When the delay error becomes as large as $\pm\tau_s/2$, the delay is adjusted by
an increment $\pm\tau_s$. Thus, the delay error for a single antenna is uniformly distributed
over $\pm\tau_s/2$. Coarse delay adjustments in units of the digital sampling interval are
implemented in a FIFO (first-in-first-out) memory. These provide the major part of
the instrumental delay, but the residual delay errors are large enough that they can
cause serious loss of sensitivity if not mitigated. In the original VLA system, for
example, finer steps were provided by an adjustment in the timing of the sampler
action in steps of $\tau_0 = \tau_s/16$. The spacing of adjacent samples remains $\tau_s$ except
when a delay adjustment is made and the sample occurs earlier or later by $\tau_0$. When
the delay error becomes equal to $\tau_0/2$, the instrumental delay is adjusted by $\tau_0$, as
represented by the staircase function in Fig. 7.10. One can see from Fig. 7.10 that
the probability distribution of the delay error is uniform within a range $\pm\tau_0/2$. For
a pair of antennas, it can usually be assumed that the times of delay adjustment are
unrelated (in general, the rates of change of the geometric delay will be different
for each antenna), so the probability distribution of their combined delay errors is
a triangular function with extreme values of $\pm\tau_0$, as in Fig. 7.11. The rms value of
this delay error is


$$\left[ \frac{\int_0^{\tau_0} p(\Delta\tau)\, \Delta\tau^2\, d\Delta\tau}{\int_0^{\tau_0} p(\Delta\tau)\, d\Delta\tau} \right]^{1/2} = \frac{\tau_0}{\sqrt{6}} , \tag{7.42}$$


where $p(\Delta\tau)$ is the expression for the probability distribution of $\Delta\tau$ in Fig. 7.11.
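The value $\tau_0/\sqrt{6}$ in Eq. (7.42) follows directly from the triangular distribution. A short worked evaluation, using the expression $p(\Delta\tau) = [1 - (\Delta\tau/\tau_0)]/\tau_0$ for $\Delta\tau > 0$ given with Fig. 7.11, is:

```latex
\begin{align*}
\int_0^{\tau_0} p(\Delta\tau)\,\Delta\tau^2\, d\Delta\tau
  &= \frac{1}{\tau_0}\int_0^{\tau_0}
     \left(\Delta\tau^2 - \frac{\Delta\tau^3}{\tau_0}\right) d\Delta\tau
   = \frac{1}{\tau_0}\left(\frac{\tau_0^3}{3} - \frac{\tau_0^3}{4}\right)
   = \frac{\tau_0^2}{12} , \\
\int_0^{\tau_0} p(\Delta\tau)\, d\Delta\tau
  &= \frac{1}{\tau_0}\left(\tau_0 - \frac{\tau_0}{2}\right)
   = \frac{1}{2} , \\
\left[\frac{\tau_0^2/12}{1/2}\right]^{1/2}
  &= \frac{\tau_0}{\sqrt{6}} .
\end{align*}
```

The same result is obtained by adding the variances of two independent uniform errors, each of variance $\tau_0^2/12$.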

Fig. 7.10 Adjustment of instrumental delay in steps $\tau_0$ to compensate for geometric delay. The
vertical sections of the staircase function indicate change of instrumental delay, and the horizontal
sections are the time intervals during which the signal is sampled. Over small time intervals, the
geometric delay can be represented as a linear function of time. Both axes have the dimensions of
time but, for example, for a baseline of 1 km east-west, the maximum slope of the line representing
the geometric delay is 0.24 ns per second, and the timescales of the two axes differ by a factor of
order $10^{10}$.


7.3.6 Phase Errors and Degradation of Sensitivity 


A delay error $\Delta\tau$ results in a phase error in a signal equal to $2\pi\Delta\tau\nu$ where, for
systems using analog delays, $\nu$ represents the frequency in the IF band in which the
delays are inserted. For systems in which the signal passband is Nyquist sampled,
which is the most usual case, $\nu$ is a baseband frequency in the range 0 to $\Delta\nu$.
With a spectral correlator, the highest frequency channel within such a band has
a center frequency that is approximately equal to the high-frequency edge, $\Delta\nu$. For
frequencies in this top channel, the maximum delay error $\tau_0$ for an antenna pair
results in a phase error of $2\pi\Delta\nu\tau_0 = \pi(\tau_0/\tau_s)$. Thus, the probability distribution
of this phase error is a triangular function as in Fig. 7.11 with extreme values
$\pm(\tau_0/\tau_s)\pi$. As shown for the delay error in Eq. (7.42), the rms phase error is the
maximum value divided by $\sqrt{6}$.

To determine the effect of delay errors on sensitivity, note that for frequency
$\nu$, a delay error $\Delta\tau$ results in a phase error $2\pi\nu\Delta\tau$. Let $\alpha$ be the size of the fine
step as a fraction of the coarse step $\tau_s$. The sensitivity (i.e., the relative response) is
determined by averaging the cosine of the phase error, weighted by the triangular
distribution of delay error in Fig. 7.11,


$$\frac{2}{\alpha\tau_s} \int_0^{\alpha\tau_s} \left( 1 - \frac{\Delta\tau}{\alpha\tau_s} \right) \cos(2\pi\nu\Delta\tau)\, d\Delta\tau = \left[ \frac{\sin(\pi\nu\alpha\tau_s)}{\pi\nu\alpha\tau_s} \right]^2 . \tag{7.43}$$


This is the sensitivity for a very small bandwidth centered on frequency $\nu$, as in the
case for spectral line observations. For continuum observations, the sensitivity for a
band extending from $N\Delta\nu$ to $(N + 1)\Delta\nu$, where $N$ is any integer (including zero),
is obtained by averaging over the baseband response,


$$\frac{1}{\Delta\nu} \int_0^{\Delta\nu} \left[ \frac{\sin(\pi\nu\alpha\tau_s)}{\pi\nu\alpha\tau_s} \right]^2 d\nu = \frac{2}{\alpha} \int_0^{\alpha/2} \left[ \frac{\sin(\pi x)}{\pi x} \right]^2 dx . \tag{7.44}$$


Here, we use $\Delta\nu\tau_s = \frac{1}{2}$ and have put $\nu\alpha\tau_s = x$ for convenience in numerical
evaluation of the integral.³ For the case in which we use only the coarse delay steps
($\alpha = 1$), Eq. (7.44) is equal to 0.774, so, as noted earlier, the performance with
coarse delay steps without further mitigation is not acceptable. Some values of rms
phase error and sensitivity loss averaged across the bandwidth are given in Table 7.3.

Fig. 7.11 Probability distribution $p(\Delta\tau)$ of the delay error $\Delta\tau$ for a pair of antennas. $\tau_0$ is the
minimum increment of the instrumental compensating delay. The distribution is triangular,
extending from $-\tau_0$ to $\tau_0$ with a peak value of $1/\tau_0$ at $\Delta\tau = 0$. The expression
$p(\Delta\tau) = [1 - (\Delta\tau/\tau_0)]/\tau_0$ applies to the part of the probability function for which $\Delta\tau > 0$.

In Table 7.3, sensitivity loss for $\alpha = 1/4$ is approaching an acceptable level.
However, the maximum phase errors are $\sqrt{6} = 2.45$ times the rms value, i.e., 2.45 ×
10.6° = 26° in this case. Depending upon how fast the delay error is changing with
time, the maximum error will be decreased somewhat in the data averaging after
cross-correlation. In the $(u, v)$ plane, the rate of change of delay error goes through
zero as the $u$-component of the baseline crosses the $v$ axis. Thus, averaged data
in which the phase error is close to the maximum are to be expected, especially
for short-baseline configurations. Hence, in considering the acceptable delay errors,
phase errors should be considered as well as sensitivity loss. The original VLA
system used $\tau_s = 16\tau_0$ and a baseband IF response.
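The entries of Table 7.3 and the value 0.774 quoted for $\alpha = 1$ can be reproduced by evaluating the right-hand side of Eq. (7.44) numerically. The sketch below uses Simpson's rule for the integral and the small-angle expression for the rms phase error, $\langle\phi^2\rangle = (2\pi)^2 (\tau_0^2/6)(\Delta\nu^2/3)$ with $\tau_0 = \alpha\tau_s$ and $\Delta\nu\tau_s = 1/2$. The function names and the number of integration steps are arbitrary choices made for this illustration.

```python
import math


def sinc_squared(x):
    """[sin(pi x) / (pi x)]^2, with the limiting value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    y = math.pi * x
    return (math.sin(y) / y) ** 2


def continuum_sensitivity(alpha, n_steps=2000):
    """Right-hand side of Eq. (7.44) by Simpson's rule (n_steps even)."""
    upper = alpha / 2.0
    h = upper / n_steps
    total = sinc_squared(0.0) + sinc_squared(upper)
    for k in range(1, n_steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * sinc_squared(k * h)
    return (2.0 / alpha) * total * h / 3.0


def rms_phase_deg(alpha):
    """rms phase error over the baseband in the small-angle limit:
    sqrt(<phi^2>) = pi * alpha / sqrt(18) radians."""
    return math.degrees(math.pi * alpha / math.sqrt(18.0))


coarse_only = continuum_sensitivity(1.0)
print(f"alpha = 1: sensitivity = {coarse_only:.3f}")
for denom in (4, 8, 16, 32):
    alpha = 1.0 / denom
    loss_pct = 100.0 * (1.0 - continuum_sensitivity(alpha))
    print(f"alpha = 1/{denom}: rms phase = {rms_phase_deg(alpha):.2f} deg, "
          f"SNR loss = {loss_pct:.3f}%")
```

Because the loss falls roughly as $\alpha^2$, each halving of the fine step reduces the loss by about a factor of four, which is the pattern seen in the table.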


³In the case in which the phase errors are small, it may be convenient to use $\langle\cos\phi\rangle \simeq 1 -
\langle\phi^2\rangle/2$, where $\phi = 2\pi\nu\Delta\tau$ and $\langle\,\rangle$ indicates the mean value. Then, noting that $\Delta\tau$ and $\nu$ vary
independently, $\langle\phi^2\rangle = (2\pi)^2 \langle\Delta\tau^2\rangle \langle\nu^2\rangle$. From Eq. (7.42), $\langle\Delta\tau^2\rangle = \tau_0^2/6$, and for a baseband
response, $\langle\nu^2\rangle = \Delta\nu^2/3$.


Table 7.3 Values of the loss in signal-to-noise ratio (sensitivity) for the full baseband
response from 0 to $\Delta\nu$, as used in continuum observations

$\alpha = \tau_0/\tau_s$    $\phi_{\rm rms}$    SNR loss
1/4                         10.6°               1.7%
1/8                         5.30°               0.43%
1/16                        2.65°               0.11%
1/32                        1.33°               0.027%


7.3.7 Other Methods of Mitigation of Delay Errors 


Conceptually, the most straightforward way of keeping the loss in sensitivity
resulting from delay errors within a tolerable limit⁴ (say, ~1%) is to use a small
enough value for the minimum delay increment. This may not always be easy in
systems with wide bandwidths, which require correspondingly high sample rates
in the digitization. A possible scheme to reduce phase errors (D’Addario 2003) is
one in which, whenever a delay increment is inserted or removed, a phase jump of
magnitude $2\pi\nu_0\tau_s$ and opposite sign to the delay-induced phase jump is inserted
in the corresponding signal through an LO. Here, $\nu_0$ is the IF center frequency, for
which the phase error is exactly canceled. The overall effect for the full bandwidth
can be found by determining the value of $\langle(\nu - \nu_0)^2\rangle$, that is, the mean squared value
of frequency measured with respect to the band center:


$$\langle(\nu - \nu_0)^2\rangle = \frac{1}{\Delta\nu} \int_{\nu_0 - \Delta\nu/2}^{\nu_0 + \Delta\nu/2} (\nu - \nu_0)^2\, d\nu = \frac{\Delta\nu^2}{12} . \tag{7.45}$$


This result applies to any IF band of width $\Delta\nu$. Since the phase changes resulting 
from the changes in the instrumental delay provide a component of the frequency 
offset used to stop the interferometer fringes, it is necessary to account for this effect 
by inserting a smooth component in the form of a frequency offset, $2\pi\nu_0\,d\tau_g/dt$, 
where $\tau_g$ is the geometric delay. This could be combined with the fringe-rotation 
offset in an LO.⁶ The combination of the inserted phase jumps and the frequency 


⁴Various effects in an interferometer system limit the sensitivity. There are some large effects, such 
as aperture efficiency and quantization efficiency, and more numerous smaller ones, such as phase 
irregularities in frequency responses, LO noise, timing errors, delay errors, etc. The combined 
effect of the smaller losses can become serious, so for each one, it is reasonable to aim at a fairly 
stringent limit such as the 1% figure suggested here. 

⁵This method of mitigation of the delay errors was considered but not implemented during the 
early development of the VLA. The original idea is attributed to B. G. Clark. 

⁶In the case of a double-sideband system, the fringe rotation must be applied to the first LO, but 
the frequency offset required by the phase error reduction scheme must be applied to the second or 
a later LO so that, like the delay-induced phase errors, the offsets are applied with the same sign to 
each sideband component within the IF band. 
offset provides a sawtooth phase component that, at the band center, exactly cancels 
the phase sawtooth induced by the delay error. If this method were used with no fine 
delay steps, i.e., $\tau_0 = \tau_s$, the loss in sensitivity would be ∼13%, so a combination 
with some finer steps would be necessary. 
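
The benefit of referring the phase to the band center can be illustrated by comparing the mean-square frequency offsets directly. The following minimal sketch (the helper name `mean_square_offset` is ours) integrates numerically over a flat band of unit width. Since the mean-square phase error is proportional to the mean-square frequency offset, the factor of four between the two cases corresponds to a halving of the rms phase error for the same delay increment.

```python
def mean_square_offset(nu_lo, nu_hi, nu_ref, n=100_000):
    """Midpoint-rule estimate of <(nu - nu_ref)^2> over a flat band [nu_lo, nu_hi]."""
    step = (nu_hi - nu_lo) / n
    total = sum((nu_lo + (k + 0.5) * step - nu_ref) ** 2 for k in range(n))
    return total / n

delta_nu = 1.0
baseband = mean_square_offset(0.0, delta_nu, 0.0)             # close to delta_nu**2 / 3
centered = mean_square_offset(0.0, delta_nu, delta_nu / 2.0)  # close to delta_nu**2 / 12
print(baseband, centered, baseband / centered)
```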


7.3.8 Multichannel (Spectral Line) Correlator Systems 


In multichannel correlators, the input band is divided into many channels, and the 
signals for corresponding channels are cross-correlated. The number of channels 
is usually an integral power of two and commonly 1024 or more. Within any 
channel, the relative variation of frequency is very small. Thus, at any instant, 
the effect of a delay error $\Delta\tau$ is to introduce a phase error $2\pi\Delta\tau\,\nu_c$, where $\nu_c$ is 
the center frequency of the channel. Since the frequency variation is small across 
a single channel, the loss in signal amplitude that occurs in a wide (continuum) 
band, resulting from the frequency variation of the phase error, is avoided. The time 
variation of the delay error results in a varying phase error that can be corrected 
by inserting a phase correction for each channel at the correlator. Thus, with a 
multichannel correlator, it is possible to avoid the need for delay increments finer 
than $\tau_s$, so long as the extra processing steps to correct the phase can be incorporated. 
For an individual antenna, the maximum delay error is $\tau_s/2 = 1/(2\Delta\nu)$, and for the 
highest channel, centered very close to frequency $\Delta\nu$, the maximum phase error is 
$\pi$. The delay error variation completes one cycle in the time taken for the geometric 
delay to change by one delay increment. (Note that the rate of change of delay, $\dot{\tau}_g$, 
is different for each antenna.) This time is 
greater than the time for one fringe cycle by a factor of 2 × (signal frequency at the 
antenna)/(signal frequency in the baseband 0 to $\Delta\nu$). At any instant, the phase error 
for the signal from an antenna is equal to $2\pi$ × (delay error) × (channel frequency 
in the baseband 0 to $\Delta\nu$). If the correction is applied at the correlator output, the 
corrections for both antennas of each pair must be included. 
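
The per-channel correction described above can be sketched as follows. The sign convention (baseline phase error equal to the term for antenna m minus the term for antenna n) and the function name `corrected_visibility` are assumptions made for illustration; they do not describe any particular correlator.

```python
import cmath
import math

def corrected_visibility(vis, delay_err_m, delay_err_n, channel_freq_hz):
    """Remove the delay-induced phase from one channel of one baseline."""
    # Phase error per antenna: 2*pi * (delay error) * (baseband channel frequency).
    phase_m = 2.0 * math.pi * delay_err_m * channel_freq_hz
    phase_n = 2.0 * math.pi * delay_err_n * channel_freq_hz
    # Assumed convention: the baseline phase error is phase_m - phase_n.
    return vis * cmath.exp(-1j * (phase_m - phase_n))

# Example: a channel at 40 MHz in the baseband, delay errors of +3 ns and -2 ns.
freq, err_m, err_n = 40e6, 3e-9, -2e-9
true_vis = cmath.rect(1.0, 0.3)
measured = true_vis * cmath.exp(1j * 2.0 * math.pi * (err_m - err_n) * freq)
print(cmath.phase(corrected_visibility(measured, err_m, err_n, freq)))
```

Because the correction is a function of channel frequency, it must be evaluated separately for every channel, and it changes as the delay errors of the two antennas vary.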

Carlson and Dewdney (2000) describe a multichannel correlator designed to 
handle wide bandwidth signals (the WIDAR system). The signals from the antennas 
are Nyquist sampled, and then the band is divided into a number of channels, 
$N_c$. The Nyquist sample rate appropriate for each channel is equal to the original 
sample rate divided by $N_c$, and the sample rates are adjusted to this value at the 
filter outputs. The outputs of the filters then go to separate cross-correlators. In this 
way, the total bandwidth that can be processed is not limited by the capacity of 
a single correlator. A value of $N_c = 32$ would be sufficient to reduce the loss in 
sensitivity resulting from the delay errors to an acceptably small value. Adjusting 
the phases of the signals at the correlator inputs removes the phase errors resulting 
from delay errors and also provides fringe stopping. Phase adjustment at this point is 
possible because the samples are in complex form, having been through the filtering 
process. Since multichannel correlators give a means of removing channels that are 
contaminated by interference, they are widely used for continuum as well as spectral 
line observations. 
7.3.9 Double-Sideband Systems 


The considerations up to this point have applied to single-sideband (SSB) systems. 
For double-sideband (DSB) systems, some differences must be considered (Thompson 
and D’Addario 2000). For an SSB system, the main effect of a phase error is 
to cause a rotation of the correlation vector, as indicated in Fig. 6.5a, resulting in 
an error in the correlator output phase, as considered above.⁷ For a DSB system, 
the delay error causes the components of the correlation vector resulting from the 
two sidebands to rotate in opposite directions in the complex plane, as shown in 
Fig. 6.5b, where the line AB represents the phase angle when the delay error is 
zero. The amplitude of the vector sum of the two components is proportional to 
$\cos(2\pi\nu_0\Delta\tau)$, where $\nu_0$ is the IF center frequency, but the phase of the correlation 
is not changed by a variation in the instrumental delay. 

Consider a case in which the geometric delay is varying rapidly enough that the 
delay error changes sign several times during the minimum averaging time at the 
correlator output. For an SSB system (Fig. 6.5a), the phase of the correlation vector 
swings back and forth, following the difference of the error patterns for the two 
antennas (the small arrows indicating variation of the vector phase in Fig. 6.5 reverse 
direction when the sign of the phase error changes). For a DSB system (Fig. 6.5b), 
the phase angles of the vectors representing the two sideband responses move in 
opposite senses. In both the SSB and DSB cases, components of the correlation 
that are normal to the vector time-average (in Fig. 6.5b, the line AB) cancel, and 
the magnitude of the correlation is proportional to the time average of the cosine 
of the phase measured with respect to the mean phase. Over an averaging period 
in which the SSB phase error changes sign, the loss in sensitivity is effectively 
the same for the SSB and DSB systems. Note, however, that in the SSB case, 
the loss in sensitivity occurs in the averaging, whereas in the DSB case, the loss 
occurs immediately in the correlation process. Thus, in the SSB case, there is an 
opportunity to correct for phase errors after cross-correlation, but in the DSB case, 
this is possible only if the sideband responses can be separated. 

If we are considering delay errors that are quasi-constant, or vary only slowly 
with time, the tolerance on the errors is more stringent in the DSB case. Such 
errors were more important in early interferometers with analog delay systems 
using coaxial cable or ultrasonic elements (see, e.g., Coe 1973), which could be 
temperature sensitive and difficult to calibrate accurately. In digital systems, the 
delays are controlled by a highly accurate master clock, and the only significant 


⁷There is also a relatively small decrease in the amplitude, which results from the variation of the 
phase error with frequency across the IF band and is proportional to $\mathrm{sinc}(\Delta\nu\,\Delta\tau)$, where $\Delta\nu$ is 
the IF bandwidth. This results from the averaging of the varying phase over time. The same sinc 
function appears as part of Eq. (6.9) and is shown by the broken line in Fig. 6.4. 

⁸To measure the phase of the cross-correlation of both sidebands in combination, it is necessary to 
periodically insert a $\pi/2$ phase shift into the IF signal of one antenna, or not to stop the fringes and 
fit a sine wave to the fringe function. 
errors result from the incremental nature of the adjustment. With digital delays, the 
effects are in most respects the same for SSB and DSB systems, except that for DSB 
systems, again, post-correlation corrections are possible only if the sidebands can 
be separated. 

In addition to causing a loss in sensitivity, delay-induced phase errors contribute 
to errors in the phase of the measured visibility. In this case, the values after time 
averaging, not the instantaneous values, are critical. The effective averaging time is 
of the order of the time taken for the baseline vector to cross a cell in the simple case 
of cell averaging, discussed in Sect. 5.2.2. In a synthesis array, the compensating 
delay for each antenna is adjusted to equalize the delay relative to some celestial 
reference point as the source moves across the sky. If the antenna spacings are large, 
the delay may change by several increments during most cell crossings, and the 
resulting phase errors are reduced by the data averaging. However, for any pair of 
antennas, the rate of change of the geometric delay, which is proportional to u, goes 
through zero when the baseline vector crosses the v axis. 

In conclusion, the tolerances in Table 7.2 apply to the overall system from the 
antennas to the correlator inputs. Specifications of filters that define the passband 
should include consideration of the temperature effects discussed in Sect. 7.2.7. 
The frequency selectivity of elements in the earlier stages can then be held to 
the minimum required for rejection of interfering signals. There are advantages to 
implementing the filtering that defines the passband digitally, after the sampling, 
instead of in the analog stages [see, e.g., Prabu et al. (2015)]. 


7.4 Polarization Mismatch Errors 


The response of two antennas to an unpolarized source is greatest when the antennas 
are identically polarized. Small variations in the polarization characteristics of 
one antenna relative to another occur as a result of mechanical tolerances. These 
variations lead to errors in the assignment of antenna gains in a manner similar to the 
variations in frequency responses. To examine this effect, we calculate the response 
of two arbitrarily polarized antennas to a randomly polarized source, which is given 
by the term for the Stokes parameter $I_v$ in Eq. (4.29). Definitions of symbols are in 
terms of the polarization ellipse (see Fig. 4.8 and related text). The position angle 
of the major axis is $\psi$, the axial ratio is $\tan\chi$, and subscripts m and n indicate two 
antennas of an array. As an example, we consider antennas with nominally identical 
circular polarization for which we can write $\chi_m = \pi/4 + \Delta\chi_m$ and $\chi_n = \pi/4 + \Delta\chi_n$, 
where the $\Delta$ terms represent the deviations of the corresponding parameter from the 
ideal value. The required response is 


$$G_{mn} = G_0\left[\cos(\psi_m - \psi_n)\cos(\Delta\chi_m - \Delta\chi_n) + j\sin(\psi_m - \psi_n)\cos(\Delta\chi_m + \Delta\chi_n)\right]. \tag{7.46}$$
Now $\psi_m - \psi_n$ and the $\Delta$ terms represent construction tolerances and are all small. 
Thus, we can expand the trigonometric functions and retain only the first- and 
second-order terms. Equation (7.46) then becomes 


$$G_{mn} \simeq G_0\left\{1 - \frac{1}{2}\left[(\psi_m - \psi_n)^2 + (\Delta\chi_m - \Delta\chi_n)^2\right] + j(\psi_m - \psi_n)\right\}. \tag{7.47}$$


An analysis similar to the procedure for frequency responses in Sect. 7.3 can be 
made by assigning polarization characteristics to a model group of antennas and 
determining pair gains, best-fit antenna gains, and gain residuals. For simplicity, it 
is assumed that the spread of values is of similar magnitude for the parameters $\psi$ 
and $\chi$. A 1% maximum gain residual then results from a spread of ±3.6° in $\chi$ and 
$\psi$. A value of $\Delta\chi = 3.6°$ corresponds to an axial ratio of 1.13 for the polarization 
ellipse, and it is not difficult to obtain feeds for which the deviation from circularity 
is within this value near the beam center. A similar analysis for linearly polarized 
antennas gives tolerances of the same order (Thompson 1984). 
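
The quality of the second-order expansion can be checked for deviations of the size quoted above. The sketch below takes $G_0 = 1$, evaluates the exact response of two nominally circularly polarized antennas and its second-order approximation, and also evaluates the axial ratio $\tan(\pi/4 + \Delta\chi)$ for $\Delta\chi = 3.6°$. The function names are introduced here for illustration only.

```python
import math

def pair_gain_exact(psi_m, psi_n, dchi_m, dchi_n):
    """Pair response for nominally circular polarization, with G_0 = 1 (angles in radians)."""
    dpsi = psi_m - psi_n
    return (math.cos(dpsi) * math.cos(dchi_m - dchi_n)
            + 1j * math.sin(dpsi) * math.cos(dchi_m + dchi_n))

def pair_gain_second_order(psi_m, psi_n, dchi_m, dchi_n):
    """Expansion of the exact response retaining first- and second-order terms."""
    dpsi = psi_m - psi_n
    return 1.0 - 0.5 * (dpsi**2 + (dchi_m - dchi_n)**2) + 1j * dpsi

d = math.radians(3.6)
exact = pair_gain_exact(d, -d, d, -d)
approx = pair_gain_second_order(d, -d, d, -d)
print(abs(exact - approx))                       # well below 1e-3
print(round(math.tan(math.pi / 4.0 + d), 2))     # axial ratio for a 3.6 degree deviation
```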


7.5 Phase Switching 


7.5.1 Reduction of Response to Spurious Signals 


The technique of phase switching for a two-element interferometer has been 
described in Chap. 1, where it was explained as an early method of obtaining analog 
multiplication of signals. The principle is as indicated in Fig. 1.8. However, in 
later instruments, the power-law detector is replaced by a correlator. Although more 
direct methods of signal multiplication are now used, phase switching is still useful 
to eliminate small offsets in correlator outputs that can result from imperfections 
in circuit operation or from spurious signals. The latter are difficult to eliminate 
entirely in any complicated receiving system, since combinations of harmonics of 
oscillator frequencies that fall within the observing frequency band or any IF band 
may infiltrate the electronics. Such signals, at levels too low to detect by simple test 
procedures, can be strong enough to produce unwanted components in the output. 
For an array of $n_a$ antennas, a receiving bandwidth $\Delta\nu$, and an observing duration 
$\tau$, signals at the limit of detectability are at a power level of order $(n_a\sqrt{\Delta\nu\,\tau})^{-1}$ 
relative to the noise; for example, this gives 75 dB below the noise for $n_a = 27$, 
$\Delta\nu = 50$ MHz, and $\tau = 8$ h. Similar effects can also be produced by cross 
coupling of small amounts of noise from one IF system to another. Because such 
spurious signals produce components of the visibility that change only slowly with 
time, they show up as spurious detail near the origin of the image. If they enter the 
signal channel at a point that comes after the phase switch, so that they produce a 
component with no switch-frequency variation at the synchronous detector, they can 
generally be reduced by several orders of magnitude by the phase switching. 
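
The detectability level quoted above follows directly from the expression $(n_a\sqrt{\Delta\nu\,\tau})^{-1}$. A short numerical check, with a function name chosen here for illustration:

```python
import math

def detection_limit_db(n_a, bandwidth_hz, duration_s):
    """Power level, in dB relative to the noise, of order 1 / (n_a * sqrt(bandwidth * time))."""
    level = 1.0 / (n_a * math.sqrt(bandwidth_hz * duration_s))
    return 10.0 * math.log10(level)

print(round(detection_limit_db(27, 50e6, 8 * 3600), 1))  # about -75 dB
```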
7.5.2 Implementation of Phase Switching 


Consider the problem of phase switching in a multielement array in which the 
products of the signals from all possible pairs of antennas are formed. Phase 
switching can be represented by multiplication of the received signals by periodic 
functions that alternate in time between values of +1 and —1. For the mth and 
nth antennas, let these functions be $f_m(t)$ and $f_n(t)$. Synchronous detection of the 
correlator output for these two antennas requires a reference waveform $f_m(t)f_n(t)$, 
and any nonvarying, unswitched components from the multiplier are reduced by a 
factor 

$$\frac{1}{\tau}\int_0^{\tau} f_m(t)\,f_n(t)\,dt \tag{7.48}$$

after averaging for a time $\tau$. For the periodic waveforms that we are concerned with, 
this factor will be zero if $\tau$ is a multiple of the minimum period of orthogonality 
$\tau_{\rm or}$ for $f_m(t)$ and $f_n(t)$. In fact, unwanted output components may not be exactly 
constant, because the tracking of the compensating delays introduces slow changes 
in the phases with which the spurious signals are combined. However, the unwanted 
outputs will be strongly reduced by the synchronous detection as long as their 
variation is small over the period $\tau_{\rm or}$. If the orthogonality of the phase-switching 
functions depends on the relative timing of transitions, the timing should be adjusted 
so that the functions are orthogonal at the correlator inputs. Thus, it may be 
necessary to adjust the timing of the switching waveforms at the antennas to 
compensate for the varying instrumental delays inserted as a source moves across 
the sky. 

Implementation of phase switching on an array of $n_a$ antennas calls for $n_a$ 
mutually orthogonal, two-state waveforms. Square waves whose frequencies are 
proportional to integral powers of two are orthogonal, with $\tau_{\rm or}$ equal to the period of 
the lowest nonzero frequency.⁹ In phase switching, $\tau_{\rm or}$ is equal to the data averaging 
time, which is typically a few seconds but for special cases may be as low as 10 ms. 
The shortest interval between switching transitions $\tau_{\rm sw}$ is equal to the half-period 
of the fastest square wave. Technically, it is convenient if $\tau_{\rm or}/\tau_{\rm sw}$ does not greatly 
exceed about two orders of magnitude. If one antenna remains unswitched, then 
$\tau_{\rm or}/\tau_{\rm sw} = 2^{n_a - 1}$. Square waves of the same frequency are orthogonal if their phases 
differ by a quarter of a cycle in time. When this condition for orthogonality is also 
included, $\tau_{\rm or}/\tau_{\rm sw} = 2^{n+1}$, where n is the smallest integer greater than or equal to 
$(n_a - 3)/2$. This reduces the value of $\tau_{\rm or}/\tau_{\rm sw}$, but the orthogonality then depends 
upon the relative timing of the transitions at the correlator, which is not the case for 
square waves of different frequencies. In either case, $\tau_{\rm or}/\tau_{\rm sw}$ is inconveniently large 
for a large array and, for example, for $n_a = 27$, it is of order $10^8$ in the first case and 
$10^4$ in the second. 
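
The two ratios for square-wave switching are easily evaluated. The sketch below (function names are ours) applies the two expressions given above, with one antenna left unswitched.

```python
import math

def ratio_distinct_frequencies(n_a):
    """Ratio of orthogonality period to shortest switching interval: 2**(n_a - 1)."""
    return 2 ** (n_a - 1)

def ratio_with_quadrature_pairs(n_a):
    """Ratio when square waves of equal frequency in quadrature are also used: 2**(n + 1)."""
    n = math.ceil((n_a - 3) / 2)   # smallest integer >= (n_a - 3)/2
    return 2 ** (n + 1)

print(ratio_distinct_frequencies(27))   # 67108864, of order 1e8
print(ratio_with_quadrature_pairs(27))  # 8192, of order 1e4
```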


⁹Such waveforms are sometimes referred to as Rademacher functions. 
It is useful to note that a condition for a pair of square waves of different 
frequency to be orthogonal, for arbitrary time shifts, is that they do not contain 
Fourier components of the same frequency. A property of square waves is that 
all even-numbered Fourier components (i.e., even harmonics of the fundamental 
frequency) have zero coefficients, but odd-numbered components have nonzero 
coefficients. Thus, although sinusoids with frequencies proportional to 1, 2, 3,... 
are mutually orthogonal, square waves with such frequencies, in general, are not. 
For example, square waves of frequencies 1, 2, and 4 have no common Fourier 
components and are mutually orthogonal, but 1, 3, and 5 have common components 
and are not mutually orthogonal. D’Addario (2001) shows by generalization of 
this analysis that the lowest frequency sets of N mutually orthogonal square waves 
consist of those with frequencies proportional to $2^n$ for $n = 0, 1, \ldots, (N - 1)$, that 
is, the square-wave sets discussed above. Since the different square waves of a set 
that we are considering contain no common Fourier components, their orthogonality 
is not affected by relative time shifts. Note, also, that exact orthogonality is not 
essential for phase switching. Unwanted responses can be reduced by a factor of 
$10^4$ or more by using square waves with k cycles per averaging period for values of 
k that are prime numbers greater than 100. 
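
These statements can be verified by direct summation on sampled square waves. In the sketch below, which uses function names introduced for illustration, the number of samples is chosen so that all transitions fall on sample boundaries, and time shifts are cyclic and restricted to whole samples. The primes 101 and 103 are an arbitrary example of the last case.

```python
def square_wave(k, n_samples):
    """k complete cycles of a +1/-1 square wave; n_samples must be a multiple of 2*k."""
    half = n_samples // (2 * k)
    return [1 if (i // half) % 2 == 0 else -1 for i in range(n_samples)]

def overlap(k1, k2, n_samples, shift=0):
    """Normalized inner product of two square waves, the second cyclically shifted."""
    a, b = square_wave(k1, n_samples), square_wave(k2, n_samples)
    return sum(a[i] * b[(i + shift) % n_samples] for i in range(n_samples)) / n_samples

def worst_overlap(k1, k2, n_samples):
    """Largest magnitude of the overlap over all cyclic shifts of whole samples."""
    return max(abs(overlap(k1, k2, n_samples, s)) for s in range(n_samples))

n = 240  # a multiple of 2*k for k = 1, 2, 3, 4, 5
print([worst_overlap(a, b, n) for a, b in ((1, 2), (1, 4), (2, 4))])  # all zero
print(overlap(1, 3, n), overlap(3, 5, n))                             # nonzero
print(overlap(101, 103, 2 * 101 * 103))                               # just under 1e-4
```

The frequencies 1, 2, and 4 give zero overlap for every shift, whereas 1 and 3, or 3 and 5, do not.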

For arrays with large numbers of antennas, Walsh functions are generally 
the preferred waveforms for phase switching. Walsh functions are rectangular 
waveforms in which transitions between +1 and —1 occur at intervals that are a 
varying integral submultiple of a basic time cycle, as in Fig. 7.12. For a description 
of Walsh functions (Walsh 1923; also Fowle 1904) see, for example, Harmuth (1969, 
1972) or Beauchamp (1975). Various systems of designating and ordering Walsh 
functions are in use. In one system (Harmuth 1972), those with even symmetry 
are designated as cal(k,t) and those with odd symmetry as sal(k, t). Here, t is 
time expressed as a fraction of the time base T, which is the interval at which 
the waveform repeats, and k is the sequency, which is equal to half the number 


[Figure 7.12: waveforms of sal(1, t), cal(3, t), sal(9, t), and cal(12, t) over one time base] 


Fig. 7.12 Four examples of Walsh functions, each of which repeats after the one cycle of the time 
base interval plotted above. Within this interval, the sal functions are odd, and the cal functions are 
even. The value of each function alternates between 1 and −1. The first number in parentheses in 
the name of each function is the sequency, which is equal to half the number of zero crossings in 
the time base interval. Time t is measured as a fraction of the time base. 
of zero crossings within the time base. Walsh functions with different sequencies 
are orthogonal, and cal and sal functions of the same sequency are orthogonal but 
differ only by a time offset. The orthogonality requires that the time bases of the 
individual Walsh functions be aligned in time, so time offsets are not permitted. 
Walsh functions with sequencies that are integral powers of two are square waves. 
If one antenna is unswitched, and if only the cal or only the sal functions are 
used, the highest sequency required is $n_a - 1$. Then $\tau_{\rm or}/\tau_{\rm sw} = 2n$, where n is the 
smallest power-of-two integer greater than or equal to $n_a - 1$. If both cal and sal 
functions are used, then n is the smallest power-of-two integer greater than or equal 
to $(n_a - 1)/2$. For example, for $n_a = 64$, $\tau_{\rm or}/\tau_{\rm sw}$ is 128 in the first case and 64 in the 
second. Another designation for Walsh functions, wal(n, t), includes both cal and sal 
functions, cal(n, t) = wal(2n, t) and sal(n, t) = wal(2n − 1, t). 
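
For comparison with the square-wave case, the corresponding ratio for Walsh-function switching can be evaluated as follows. This is a minimal sketch of the rule just stated, with one antenna unswitched; the function names are ours.

```python
def next_power_of_two(x):
    """Smallest power of two greater than or equal to x."""
    p = 1
    while p < x:
        p *= 2
    return p

def walsh_ratio(n_a, use_cal_and_sal):
    """Ratio of orthogonality period to shortest switching interval, 2n."""
    needed = (n_a - 1) / 2 if use_cal_and_sal else n_a - 1
    return 2 * next_power_of_two(needed)

print(walsh_ratio(64, use_cal_and_sal=False))  # 128
print(walsh_ratio(64, use_cal_and_sal=True))   # 64
```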

One method of generating Walsh functions makes use of Hadamard matrices, of 
which the one of lowest order is 


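$$\mathbf{H}_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{7.49}$$

Higher-order Hadamard matrices can be obtained by replacing each element of $\mathbf{H}_2$ 
by the matrix $\mathbf{H}_2$ multiplied by the element replaced [which is equivalent to forming 
an outer product; see Eq. (4.48)]. If this is performed twice, for example, we obtain 

$$\mathbf{H}_8 = \left[\begin{array}{rrrrrrrr}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{array}\right]
\begin{array}{l}
\mathrm{cal}(0,t),\ \mathrm{pal}(0,t) \\
\mathrm{sal}(4,t),\ \mathrm{pal}(4,t) \\
\mathrm{sal}(2,t),\ \mathrm{pal}(2,t) \\
\mathrm{cal}(2,t),\ \mathrm{pal}(6,t) \\
\mathrm{sal}(1,t),\ \mathrm{pal}(1,t) \\
\mathrm{cal}(3,t),\ \mathrm{pal}(5,t) \\
\mathrm{cal}(1,t),\ \mathrm{pal}(3,t) \\
\mathrm{sal}(3,t),\ \mathrm{pal}(7,t)
\end{array}. \tag{7.50}$$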


The rows of the matrices correspond to the Walsh functions indicated, the signs 
being reversed for odd sequencies in this particular generation process. The 
waveform required at the phase detector is the product of the phase-switching 
functions at the two antennas involved. The product of two such Walsh functions is 
a Walsh function, the sequency of which is greater than, or equal to, the difference 
between the sequencies of the two original functions. 
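
The generation of Walsh functions from Hadamard matrices can be sketched in a few lines. The code below builds the matrix of order 8 by the replacement rule described above, and labels each row by counting sign changes (cyclically, so that the transition at the end of the time base is included) and by testing its symmetry about the midpoint. The function names are introduced here for illustration.

```python
def hadamard(order):
    """Hadamard matrix built by repeated replacement; order must be a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def sequency(row):
    """Half the number of sign changes in one time base, counted cyclically."""
    n = len(row)
    return sum(1 for i in range(n) if row[i] != row[(i + 1) % n]) // 2

def kind(row):
    """'cal' for even symmetry about the midpoint, 'sal' otherwise."""
    return "cal" if row == row[::-1] else "sal"

h8 = hadamard(8)
for row in h8:
    print(row, f"{kind(row)}({sequency(row)}, t)")
```

The labels printed for the eight rows agree with the sequency designations listed beside the matrix of order 8.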

Walsh functions can also be generated as products of square-wave functions. 
Square-wave functions are here designated Sq(n, t), where n is an integer and the 
half-period of the square wave is $T/2^n$; that is, there are $2^{n-1}$ complete cycles within 
the time base, T. The function Sq(0, t) has a constant value of unity. In the examples 
in Fig. 7.12, sal(1, t) is a square-wave function, and cal(3, t) and sal(9, t) are each 
products of sal(1, t) and one other square-wave function. When considering Walsh 
functions as products of square-wave functions, it is convenient to use the Paley 
designation of Walsh functions, pal(n, t) (Paley 1932). The integer n is called the 
natural order of the Walsh function. A Walsh function pal(n, t), which is the product 
of square-wave functions Sq(i, t), Sq(j, t), ..., Sq(m, t), has a natural order number 
$n = 2^{i-1} + 2^{j-1} + \cdots + 2^{m-1}$. The product of two Walsh functions is another 
Walsh function, of which the natural order number is given by modulo-2 addition 
(that is, no-carry addition) of the binary natural order numbers of the component 
Walsh functions. 

Table 7.4 shows the relationship between the natural order numbers for a series 
of Walsh functions and the square-wave functions of which they are composed. The 
product of two Walsh functions can be expressed as the product of the component 
square-wave functions, for example, 


$$\begin{aligned}
\mathrm{pal}(7,t)\times\mathrm{pal}(10,t) &= [\mathrm{Sq}(1,t)\times\mathrm{Sq}(2,t)\times\mathrm{Sq}(3,t)]\times[\mathrm{Sq}(2,t)\times\mathrm{Sq}(4,t)] \\
&= \mathrm{Sq}(1,t)\times\mathrm{Sq}(2,t)\times\mathrm{Sq}(2,t)\times\mathrm{Sq}(3,t)\times\mathrm{Sq}(4,t) \\
&= \mathrm{Sq}(1,t)\times\mathrm{Sq}(3,t)\times\mathrm{Sq}(4,t) \\
&= \mathrm{pal}(13,t),
\end{aligned} \tag{7.51}$$

where we have used the fact that the product of a Walsh or square-wave function 
with itself is equal to unity. The natural orders of the two Walsh functions, 7 and 10, 
in binary form are 0111 and 1010. The modulo-2 addition of these binary numbers 
is 1101, which is equal to 13, the natural order of the Walsh function product. 
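
The modulo-2 addition rule corresponds to a bitwise exclusive-OR of the natural order numbers. The following sketch builds pal(n, t) as a product of square-wave functions and confirms the example above at a set of sample times. It assumes that each Sq(n, t) begins the time base at +1 and takes T = 1; neither choice affects the product rule.

```python
def sq(n, t):
    """Square-wave function Sq(n, t) for 0 <= t < 1, with half-period 1/2**n; Sq(0, t) = 1."""
    if n == 0:
        return 1
    return 1 if int(t * 2**n) % 2 == 0 else -1   # assumed polarity: +1 first

def pal(order, t):
    """Walsh function of natural order `order`: bit (i - 1) set means Sq(i, t) is a factor."""
    value, i = 1, 1
    while order:
        if order & 1:
            value *= sq(i, t)
        order >>= 1
        i += 1
    return value

times = [(k + 0.5) / 64 for k in range(64)]
print(7 ^ 10)                                                        # 13
print(all(pal(7, t) * pal(10, t) == pal(7 ^ 10, t) for t in times))  # True
```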


Table 7.4 Square-wave components of some Walsh functions 

Natural order                Square-wave components                 Sequency
designation    Sq(0, t)  Sq(1, t)  Sq(2, t)  Sq(3, t)  Sq(4, t)  designation
pal(0, t)         1                                              cal(0, t)
pal(1, t)                   1                                    sal(1, t)
pal(2, t)                             1                          sal(2, t)
pal(3, t)                   1         1                          cal(1, t)
pal(4, t)                                       1                sal(4, t)
pal(5, t)                   1                   1                cal(3, t)
pal(6, t)                             1         1                cal(2, t)
pal(7, t)                   1         1         1                sal(3, t)
pal(8, t)                                                 1      sal(8, t)
pal(9, t)                   1                             1      cal(7, t)
pal(10, t)                            1                   1      cal(6, t)
pal(11, t)                  1         1                   1      sal(7, t)
pal(12, t)                                      1         1      cal(4, t)
pal(13, t)                  1                   1         1      sal(5, t)
pal(14, t)                            1         1         1      sal(6, t)
pal(15, t)                  1         1         1         1      cal(5, t)
The examination of Walsh functions as products of square-wave functions 
leads to a useful insight into the efficiency of Walsh function phase switching in 
eliminating unwanted components (Emerson 1983, 2009). Let U(t) be an unwanted 
response within the receiving system, for example, resulting from cross talk in IF 
signals or from an error in the sampling level of a digitizer. U(t) arises after the 
initial phase switching, so when synchronous detection with the phase-switching 
waveform is performed at a later stage, U(t) becomes U(t) pal(n, t), and this product 
is significantly reduced in the subsequent averaging. Suppose that pal(n, t) is the 
product of m square-wave functions, Sq(1, t), Sq(2, t), ..., Sq(m, t). We can consider 
multiplying U(t) by pal(n, t) as equivalent to multiplying by each of the square- 
wave components in turn. Also, we assume that the period of the square-wave 
functions is small compared with the timescale of variations of U(t). Then, after 
the first multiplication and averaging, the mean residual spurious voltage is 


$$U_1 = \frac{\delta t}{2}\frac{dU}{dt} = \frac{T}{2^{i+1}}\frac{dU}{dt}, \tag{7.52}$$

where $\delta t$ is equal to the half-period of the square-wave function, $T/2^i$. $U_1$ is 
calculated for one cycle of Sq(i, t), but within the assumption that U(t) is slowly 
varying, $U_1$ can be taken as equal to the average over the Walsh time base T. 
Multiplication by the second square-wave function is obtained by replacing U in 
Eq. (7.52) by $U_1$, which yields 

$$U_2 = \frac{T}{2^{j+1}}\frac{dU_1}{dt} = \frac{T^2}{2^{i+j+2}}\frac{d^2U}{dt^2}. \tag{7.53}$$

For the m square-wave components, we obtain 

$$U_m = \frac{T^m}{2^{m(m+3)/2}}\frac{d^mU}{dt^m}, \tag{7.54}$$

so only the higher derivatives of U remain. 
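
This behavior can be checked exactly for polynomial test signals. In the argument above, each square-wave component Sq(i, t) contributes a factor $T/2^{i+1}$ and one differentiation, so the product Sq(1, t) × ··· × Sq(m, t) should leave a residual of magnitude $T^m\,2^{-m(m+3)/2}\,d^mU/dt^m$ and should remove lower-order terms entirely. The sketch below takes T = 1 and $U = t^p$, and evaluates the mean of the product in rational arithmetic. It assumes that each square wave starts at +1, so that the sign of the residual alternates with m and only magnitudes are compared. The function name is ours.

```python
from fractions import Fraction
from math import factorial

def mean_residual(m, power):
    """Exact mean over T = 1 of t**power multiplied by Sq(1,t) * ... * Sq(m,t)."""
    pieces = 2 ** m                      # every Sq(i, t) with i <= m is constant on each piece
    total = Fraction(0)
    for k in range(pieces):
        # Sign of the product on piece k; assumed polarity: each Sq starts at +1.
        sign = 1
        for i in range(1, m + 1):
            if (k >> (m - i)) & 1:
                sign = -sign
        lo, hi = Fraction(k, pieces), Fraction(k + 1, pieces)
        total += sign * (hi ** (power + 1) - lo ** (power + 1)) / (power + 1)
    return total

for m in (1, 2, 3, 4):
    predicted = Fraction(factorial(m), 2 ** (m * (m + 3) // 2))  # d^m(t^m)/dt^m = m!
    print(m, mean_residual(m, m - 1), abs(mean_residual(m, m)), predicted)
```

For each m, the term of order m − 1 averages to zero, and the term of order m leaves a residual of the predicted magnitude.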

Walsh functions pal(n, t) for which n is an integral power of two are the least 
effective in eliminating unwanted responses, since they are each just a single 
square-wave function. As shown by examination of Table 7.4, those for which 
$n = 2^k - 1$, where k is an integer, contain the largest number of square-wave 
components. In arrays with a small number of antennas, for which a large number of 
different switching functions is not required, it is possible to select Walsh functions 
that are the most effective in reducing unwanted components. Similarly, Walsh 
functions can be more effective than square waves in some applications to single 
antennas, such as beam switching between a source and a reference position on 
the sky (Emerson 1983). Another set of possible phase-switching functions are m- 
sequences, considered by Keto (2000) for cases in which both 90° and 180° phase 
changes are required. 
7.5.3 Timing Accuracy in Phase Switching 


In designing a phase-switching system, timing tolerances should be considered. 
In general, timing errors should be much smaller than the minimum interval 
between transitions in any function. For example, in ALMA [the Atacama Large 
Millimeter/submillimeter Array, Wootten (2003), Wootten and Thompson (2009)], 
two phase-switching actions are used, one nested within the other (Emerson 2007). 
For sideband separation, $\pi/2$ phase switching is used, with Walsh functions from a 
128-element set with time base 2.048 s and minimum interval between transitions 
of 16 ms. A second 128-element set with time base 16 ms and minimum interval 
125 μs is used for $\pi$-shift switching. Thus, timing errors must be very small 
compared with 125 μs. 

In general, the orthogonality of Walsh functions requires that there be no relative 
time shifts between the functions. The first switching occurs at the antenna location, 
that is, as early in the signal path as possible. Digitization of the received signal 
may occur at the antenna or after transmission of the signal in analog form to a 
central processing location. The major system delays are shown in Fig. 7.13, in 
which $\tau_g$ is the geometric delay, $\tau_{\rm tr}$ is the transmission delay (antenna to processing 
location), and $\tau_i$ is the instrumental compensating delay. Delays in the analog or 
digital circuitry are generally small enough to be neglected with regard to the 
timing of phase switching. There are three main timing requirements in the receiving 
system. 


1. The total delay from the incident wavefront to the correlator input, $\tau_g + \tau_{\rm tr} + \tau_i$, 
must be the same for all antennas to preserve the correlation of the wanted 
signals. This is implemented through adjustment of $\tau_i$. 

2. The corresponding transitions in the first and second phase switchings should be 
aligned in time so that the phase switching of the wanted signals is precisely 


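[Figure 7.13: timeline of one signal path. t = 0: wavefront at reference antenna; t = $\tau_g$: first switching (antenna input); t = $\tau_g + \tau_{\rm tr}$: second switching (central location); t = $\tau_g + \tau_{\rm tr} + \tau_i$: correlator input. The three intervals are the geometric delay $\tau_g$, the transmission delay $\tau_{\rm tr}$, and the instrumental delay $\tau_i$.] 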

Fig. 7.13 Delays in an array that are large enough to affect the timing of the Walsh functions used 
in phase switching. Here, t is time relative to a signal wavefront at the point where it intercepts 
the delay reference antenna. The second switching is shown after the transmission delay, which 
applies when the signals are transmitted in analog form to the central processing location. When 
digitization occurs at the antenna location, both the first and second switchings can be applied 
before the transmission delay. 
canceled. For example, if the second phase switching is done at the central 
location, the timing of the second switching should be delayed relative to the 
first one by $\tau_{\rm tr}$. 

3. Both the first and second switchings in any signal path should be delayed by the 
geometric delay of the corresponding antenna $\tau_g$ so that, at the correlator input, 
the switching transitions in the unwanted components are aligned in time from 
one antenna to another. The delay $\tau_g$ varies with time as the antennas track a 
source. 


Requirement 2 above is concerned only with the relative accuracy of switchings 
within the same signal path from one antenna to the correlator. This is the simplest 
case because it is concerned only with offsets in two switchings of the same Walsh 
function. Consider the effect of a small time offset $\delta$ in the relative timing of 
the first and second switchings. For each transition, the timing difference causes 
the correlator output voltage to be reversed for a period $\delta$ and thereby cancels an 
equivalent interval of the unreversed output. Hence, for each transition, there is an 
effective loss of signal for a period $2\delta$. The average fractional loss of sensitivity 
is $2n_t\delta/T_b$, where $n_t$ is the number of transitions within the time base $T_b$ (i.e., 
twice the Walsh sequency). Thus, for a tolerable limit of, e.g., 1% correlation loss, 
the tolerable value of $\delta$ can be determined for any given time base and maximum 
sequency used. Since the correlation loss is proportional to $n_t$, use of the lowest 
sequencies within the Walsh set helps to minimize loss in sensitivity. For arrays 
in which the number of antennas is not too large and the baselines are not too long, the 
delaying of the switchings by $\tau_g$ (as noted in the third requirement above) can 
often be neglected. This introduces a timing error $\tau_g$ that is greatest for the longest 
baselines. The effect of this error can be minimized by using the lowest values of $n_t$ 
for the antennas for which the geometric delay is greatest. 
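
The loss caused by a timing offset between the two switchings of the same Walsh function can be illustrated numerically. The sketch below samples cal(3, t) finely, multiplies it by a copy of itself offset by a few samples, and compares the resulting loss with twice the number of transitions times the offset, divided by the time base. It then applies the same expression to a 1% loss limit. The value of 128 transitions for a 16-ms time base is an assumption made for illustration (16 ms divided by a minimum interval of 125 μs), and the function names are ours.

```python
def mean_product(wave, shift):
    """Mean of f(t) * f(t - shift) over one time base, with cyclic repetition."""
    n = len(wave)
    return sum(wave[i] * wave[(i - shift) % n] for i in range(n)) / n

def tolerable_offset(max_loss, time_base, n_transitions):
    """Offset giving a fractional loss max_loss, from loss = 2 * n_t * offset / time_base."""
    return max_loss * time_base / (2 * n_transitions)

cal3 = [1, -1, 1, -1, -1, 1, -1, 1]            # cal(3, t): six transitions per time base
wave = [x for x in cal3 for _ in range(1000)]  # 8000 samples per time base
shift, n_t = 5, 6
loss_measured = 1.0 - mean_product(wave, shift)
loss_formula = 2 * n_t * shift / len(wave)
print(loss_measured, loss_formula)             # both 0.0075 to rounding error
print(tolerable_offset(0.01, 16e-3, 128))      # about 0.6 microseconds
```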

Requirements 1 and 3 are concerned with the relative timing of transitions 
at different antennas, i.e., between different Walsh functions. The effect of a 
timing offset on the rejection of the unwanted components depends on the loss 
in orthogonality of the Walsh functions used for different antennas. This is more 
complicated than the effect of an offset on two identical Walsh functions discussed 
above. The loss in orthogonality depends upon the sequencies of the two functions 
involved and is greatest for sequencies in the middle range of the Walsh set, as 
shown by Emerson (2005). Pairs consisting of a function with an even sequency and 
one with an odd sequency remain orthogonal in the presence of time shifts, but such 
combinations are possible for no more than half of the pairs in a complete Walsh 
set. Of the other pairs, some remain orthogonal with time offsets, as can be shown 
by numerical trials, and some do not [as shown in Fig. 3 of Emerson (2005)]. It is 
clearly beneficial to use equal numbers of odd and even sequencies in an array so 
that for approximately half of the antenna pairs, the orthogonality is independent of 
time offsets. 
7.5.4 Interaction of Phase Switching with Fringe Rotation 
and Delay Adjustment 


The effectiveness of phase switching in reducing the response to spurious signals 
depends on the point in the signal channel at which these unwanted signals are 
introduced. The three following cases illustrate the most important possibilities. 


1. The unwanted signal enters the antennas or some point in the signal channels that 
is ahead of the phase switching, the fringe rotation, and the compensating delays. 
The unwanted signal then suffers phase switching like the wanted signals and is 
not suppressed in the synchronous detection (although it may be reduced by the 
fringe rotation if the fringe frequency is high, as in the case of VLBI). Externally 
generated interference behaves in this manner, and its effect is discussed in 
Chap. 16. 

2. The unwanted signal enters after the phase switching but before the fringe 
rotation and delay compensation. The fringe-rotation phase shifts, designed to 
reduce to zero the fringe frequencies of the desired signals at the correlator 
output, act on the spurious signal and cause it to appear at the correlator output 
as a component at the natural fringe frequency for a point source at the phase 
reference position. This component then undergoes synchronous detection with a 
Walsh function. If the natural fringe frequency transiently matches the frequency 
of a Fourier component of this Walsh function, a spurious response can occur. 

3. The spurious signal enters after the phase switching and the fringe rotation but 
before the delay compensation. The signal then suffers phase shifts resulting 
from the changing of the compensating delay. The resulting component at the 
correlator output has a frequency equal to the natural fringe frequency that would 
occur if the observing frequency were equal to the IF at which the compensating 
delays are introduced. Thus, the oscillations are one to three orders of magnitude 
lower in frequency than the natural fringe frequency, and it is consequently easier 
to avoid coincidence with the frequency of a component of the Walsh function. 


From these considerations, it is usually advantageous to perform both the phase 
switching and the fringe rotation as early in the signal channel as possible. 
Figure 7.14 shows, as an example, the phase-switching scheme that was used in 
the original VLA system, from a description by Granlund et al. (1978). The phase 
switching at the antenna was performed on an LO, rather than on the full signal band, 
so that a broadband phase switch was not needed. The signals were digitized at the 
output of the final IF amplifier and thereafter were delayed and multiplied digitally. 
In such a system, slow phase drifts that may occur after the phase switching are 
removed by the synchronous detection at the digitizing sampler. The synchronous 
detection could be performed by reversing the sign bit in the digitized signal data and 
needed to be applied only to $n_a$ signal channels rather than $n_a(n_a - 1)/2$ correlator 
outputs. 
Fig. 7.14 Simplified schematic diagram of the receiving channel for one antenna of the original 
VLA system. Walsh functions generated by the computer were periodically fed to digital buffers, 
from which they were clocked out to the phase switch and to sign-reversal circuitry at the digitizing 
sampler. © 1978 IEEE. Reprinted, with permission, from J. Granlund et al. (1978). 


7.6 Automatic Level Control and Gain Calibration 


In most synthesis arrays, automatic level control (ALC) circuits are used to hold 
constant the level of the total signal, that is, the cosmic signal plus the system 
noise, at certain critical points. A fraction of the total signal level is detected, and 
the resulting voltage is compared with a preset value to generate a control signal 
that is fed back to a variable-gain element of the signal chain. Points at which the 
signal level is critical include modulators for transmission of IF signals on optical 
or microwave carriers and inputs to analog correlators or digital samplers. For a 
discussion of level tolerances in samplers, see Sect. 8.5.1. 

The effect of an ALC loop is to hold constant the quantity $|g|^2 (T_S + T_A)\Delta\nu$, 
where $g$ is the voltage gain from the antenna output to the point of gain control, $T_S$ 
is the system temperature, and $T_A$ is the component of antenna temperature due to 
the source under observation. Thus, $|g|^2$ is made to vary inversely as $(T_S + T_A)$, 
which can change substantially with the antenna pointing angle as a result of 
ground radiation in the sidelobes and atmospheric attenuation. To measure such 
gain changes, a signal from a broadband noise source can be injected at the input 
of the receiving electronics. This noise source is switched on and off, usually at 
a frequency of a few hertz to a few hundred hertz, and the resulting component 
is sampled and monitored using a synchronous detector. When the noise source is 
on, it adds a calibrating component $T_C$ to the overall system temperature, which 
should not be more than a few percent of $T_S$ to avoid degradation of sensitivity. The 
amplitude of the switched component is a direct measure of the system gain, and 
for $T_A \ll T_S$, the ratio of the signal levels with the noise source on and off is equal 
to $1 + T_C/T_S$, which provides a continuous measure of $T_S$. This scheme does not 
correct for changes in antenna gain resulting from mechanical deformation, which 
must be calibrated separately by periodic observation of a radio source. 
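
The on/off relation above can be inverted to estimate $T_S$ from the two measured power levels. The short Python sketch below does this, assuming $T_A \ll T_S$ as in the text. The function name and the numerical values (a 2 K calibration signal and a 4% power step) are hypothetical and serve only as an illustration.

```python
def system_temperature(p_on, p_off, t_cal_k):
    """Estimate T_S (K) from P_on / P_off = 1 + T_C / T_S, valid for T_A << T_S."""
    ratio = p_on / p_off
    if ratio <= 1.0:
        raise ValueError("on/off ratio must exceed 1 when calibration noise is injected")
    return t_cal_k / (ratio - 1.0)


# Assumed example values (hypothetical): powers in arbitrary linear units.
t_cal_k = 2.0   # injected calibration temperature T_C
p_off = 1.00    # detected power with the noise source off
p_on = 1.04     # detected power with the noise source on

t_sys = system_temperature(p_on, p_off, t_cal_k)
print(f"estimated system temperature: {t_sys:.1f} K")
```

In this example $T_C$ is 4% of the derived $T_S$, consistent with the guideline that the calibration signal should be only a few percent of the system temperature.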


7.7 Fringe Rotation 


The fringe oscillations in the data from the correlator must be removed before an 
image can be formed. This process is sometimes referred to as fringe stopping 
(i.e., stopping the motion of the fringes with respect to the astronomical sky). As 
described in Chap. 6, this can be achieved by inserting a fringe-frequency offset 
on an LO. For a multiantenna array, the offset for each antenna is chosen to stop 
the fringes for that antenna when combined with a common reference antenna. It is 
also possible to stop the fringes by inserting corrections in the phase of the signals 
at the correlator. If the corrections are inserted before the cross multiplication that 
occurs in the correlation, they can be applied to each of the $n_a$ antennas of the 
array (see, e.g., Carlson and Dewdney 2000), whereas after cross multiplication, the 
corrections must be applied to all of the approximately $n_a^2/2$ antenna pairs. However, 
corrections inserted before cross multiplication must be applied to each signal sample 
that goes to the cross-correlator, whereas corrections applied to the cross products can 
be performed after some time-limited averaging of the products. (The averaging 
must not be so long that the fringe oscillations are attenuated.) The effect of time 
averaging on the fringes is to convolve the sinusoidal fringe function of frequency 
$\nu_f$ with a rectangular function of width equal to the averaging time $\tau_{\rm av}$. A 1% loss 
in sensitivity occurs for $\nu_f\tau_{\rm av} = 0.078$ and a 2% loss for $\nu_f\tau_{\rm av} = 0.111$. As an 
approximate criterion, the averaging time should be no more than $\sim 1/10$ of a fringe 
period. For a DSB system, the LO offset must be applied to the first LO, or if the 
fringe stopping is applied at the correlator, the sidebands must first be separated. For 
sideband separation, see Sect. 6.1.12. 
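
The averaging losses quoted above follow from the fact that averaging a sinusoid of frequency $\nu_f$ over a rectangular window of width $\tau_{\rm av}$ scales its amplitude by $\sin(\pi\nu_f\tau_{\rm av})/(\pi\nu_f\tau_{\rm av})$. The following Python sketch evaluates this factor; the function name is an assumption made for illustration.

```python
import numpy as np


def averaging_loss(fringe_freq_hz, averaging_time_s):
    """Fractional amplitude loss of a fringe averaged over a rectangular window.

    np.sinc(x) is the normalized sinc, sin(pi * x) / (pi * x).
    """
    return 1.0 - np.sinc(fringe_freq_hz * averaging_time_s)


# Products nu_f * tau_av quoted in the text, plus the 1/10-period criterion.
for product in (0.078, 0.111, 0.1):
    loss = averaging_loss(1.0, product)
    print(f"nu_f * tau_av = {product:.3f}: loss = {100.0 * loss:.2f}%")
```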


Appendix 7.1 Sideband-Separating Mixers 


The principle of the sideband-separating mixer, or image-rejection mixer, is shown 
in Fig. A7.1. The terms $\cos(2\pi\nu_u t)$ and $\cos(2\pi\nu_\ell t)$ represent frequency components 
of the input waveform at the upper- and lower-sideband frequencies, respectively. 
The input is applied to two mixers, for which the LO waveforms at frequency $\nu_{\rm LO}$ are 
in phase quadrature. The mixers generate products of the signal and LO waveforms, 
and the filters pass only the terms of frequency equal to the difference of $\nu_{\rm LO}$ and 
$\nu_u$ or $\nu_\ell$. The output from the lower mixer also passes through a $\pi/2$ phase lag 
network. From the resulting terms at points A and B, one can see that by applying 
the waveforms at these points to a summing network, the upper-sideband response is 
obtained. Similarly, by using a differencing network, the lower-sideband response 
is obtained. In either case, the accuracy of the suppression of the response to the 
unwanted sideband depends on the accuracy of the quadrature phase relationships, 
the matching of the frequency responses of the mixers and filters, and the insertion 
loss of the phase lag network. In practice, for conversion from a few gigahertz to 
baseband, suppression to a level of $-20$ dB is routinely achievable. With careful 
design, suppression to a level approaching $-30$ dB can be obtained (Archer et al. 
1981). 
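
To give a feel for the tolerances implied by these suppression levels, the Python sketch below uses a simple phasor model that is not taken from the text. It assumes that the only imperfections are an amplitude ratio $g$ between the two mixer paths and a total phase error $\epsilon$ from exact quadrature, so that the wanted sideband adds as $|1 + g e^{i\epsilon}|$ and the unwanted one cancels as $|1 - g e^{i\epsilon}|$. The function name and the sample values are illustrative assumptions.

```python
import math


def unwanted_sideband_db(amplitude_ratio, phase_error_deg):
    """Unwanted-to-wanted sideband power ratio (dB) in an idealized phasor model.

    Assumes only a gain imbalance g and a phase error eps between the two paths:
    ratio = |1 - g e^{i eps}|^2 / |1 + g e^{i eps}|^2.
    """
    eps = math.radians(phase_error_deg)
    g = amplitude_ratio
    numerator = 1.0 - 2.0 * g * math.cos(eps) + g * g
    denominator = 1.0 + 2.0 * g * math.cos(eps) + g * g
    return 10.0 * math.log10(numerator / denominator)


# Assumed example errors (hypothetical), with perfectly matched amplitudes.
for phase_error_deg in (11.42, 3.62):
    level = unwanted_sideband_db(1.0, phase_error_deg)
    print(f"phase error {phase_error_deg:5.2f} deg: suppression {level:6.1f} dB")
```

Under these assumptions, a suppression of about $-20$ dB corresponds to a phase error of roughly 11 degrees with matched amplitudes, and about $-30$ dB to roughly 3.6 degrees.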
Fig. A7.1 Schematic illustration of the principle of the sideband-separating (image-rejection) 
mixer. The upper-sideband response is obtained from the sum of the outputs A and B, and the 
lower-sideband response from the difference of these outputs. 
Appendix 7.2 Dispersion in Optical Fiber 


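For a frequency component, $\nu_m$, of a signal modulated onto an optical carrier, 
$A\sin(2\pi\nu_{\rm opt}t + \phi)$, and transmitted down a fiber, the resulting signal intensity at 
the output of the fiber can be represented by 

$$
\begin{aligned}
A^2[1 + m\cos(2\pi\nu_m t)]\sin^2(2\pi\nu_{\rm opt}t + \phi)
 &= A^2\sin^2(2\pi\nu_{\rm opt}t + \phi) \\
 &\quad + \frac{A^2 m}{2}\sin[2\pi(\nu_{\rm opt} + \nu_m)(t - \Delta t) + \phi]\,\sin(2\pi\nu_{\rm opt}t + \phi) \\
 &\quad + \frac{A^2 m}{2}\sin[2\pi(\nu_{\rm opt} - \nu_m)(t + \Delta t) + \phi]\,\sin(2\pi\nu_{\rm opt}t + \phi),
\end{aligned}
\tag{A7.1}
$$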


where $m$ is the modulation index. This equation resembles the usual representation 
for amplitude modulation in communications, except that here, the carrier power 
varies linearly with the modulation. Thus, on the left side, the square of the carrier 
expression is used. For the terms of frequency $\nu_{\rm opt} \pm \nu_m$, the time has been offset by 
$\mp\Delta t$ to represent the effects of the variation of propagation velocity with frequency. 
$\Delta t$ can take both positive and negative values depending on the sign of the dispersion 
$D$ shown in Fig. 7.3. Each term in Eq. (A7.1) is proportional to optical power and, 
thus, also to the modulation amplitude. By applying the identity for the product of 
two sines to each term on the right side of Eq. (A7.1), and ignoring DC and optical 
frequency terms, we obtain for the amplitude at the output of the optical receiver, 

$$
\begin{aligned}
&\frac{A^2 m}{4}\left\{\cos[2\pi\nu_m(t + \Delta t) - 2\pi\nu_{\rm opt}\Delta t] + \cos[2\pi\nu_m(t - \Delta t) - 2\pi\nu_{\rm opt}\Delta t]\right\} \\
&\qquad = \frac{A^2 m}{2}\left\{\cos[2\pi(\nu_m t - \nu_{\rm opt}\Delta t)]\cos(2\pi\nu_m\Delta t)\right\}.
\end{aligned}
\tag{A7.2}
$$


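The free-space wavelength corresponding to frequency $\nu_{\rm opt}$ is $\lambda_{\rm opt}$, and the 
wavelength difference between frequencies $\nu_{\rm opt}$ and $\nu_{\rm opt} + \nu_m$ is $\lambda_{\rm opt}^2\nu_m/c$ (since 
$\nu_m \ll \nu_{\rm opt}$). If $D$ is the dispersion and $\ell$ is the length of the fiber, 
$\Delta t = D\ell\lambda_{\rm opt}^2\nu_m/c$, and $\nu_{\rm opt}\Delta t = D\ell\lambda_{\rm opt}\nu_m$. Thus, the recovered modulation 
can be written as 

$$
\frac{A^2 m}{2}\left\{\cos[2\pi\nu_m(t - D\ell\lambda_{\rm opt})]\cos(2\pi\nu_m^2 D\ell\lambda_{\rm opt}^2/c)\right\}.
\tag{A7.3}
$$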


The phase change induced by $\Delta t$ at the carrier frequency $\nu_{\rm opt}$ appears in the phase of 
the modulation frequency in the first cosine function in Eq. (A7.2). At frequency $\nu_m$, 
this phase term is equivalent to a time delay $D\ell\lambda_{\rm opt}$, as seen in Eq. (A7.3). This delay 
is much larger than $\Delta t$ and represents the difference between the phase and the group 
velocities in the fiber. The second cosine modifies the amplitude of the modulation 
component $\nu_m$. For example, with dispersion $D = 2$ ps/(km·nm) (note that this is 
equal to $2 \times 10^{-6}$ s m$^{-2}$), $\ell = 50$ km, $\lambda_{\rm opt} = 1550$ nm, and $\nu_m = 10$ GHz, we 
obtain $\Delta t = 8$ ps, $D\ell\lambda_{\rm opt} = 155$ ns, and the response at frequency $\nu_m$ is reduced by 
1.1 dB relative to the low-frequency end of the modulation spectrum. Note that we 
have assumed above that the frequency spread of the laser results entirely from the 
modulation spectrum, which is justifiable for a high-quality laser with an external 
modulator. Modulation of a diode laser by varying the voltage across it can result in 
unwanted frequency modulation, further spreading the optical spectrum. 
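
The numbers in the example above can be reproduced with a few lines of Python. The sketch below converts $D$ to SI units and evaluates $\Delta t$, the delay $D\ell\lambda_{\rm opt}$, and the amplitude factor $\cos(2\pi\nu_m\Delta t)$ from Eq. (A7.3). The function name is an assumption, and the reduction is expressed as $20\log_{10}$ of the amplitude factor, which is an assumption about the convention but reproduces the quoted value.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light


def dispersion_effects(d_ps_per_km_nm, length_km, wavelength_nm, mod_freq_hz):
    """Return (delta_t in s, delay D*l*lambda in s, amplitude reduction in dB)."""
    d_si = d_ps_per_km_nm * 1e-12 / (1e3 * 1e-9)   # ps/(km nm) -> s m^-2
    length_m = length_km * 1e3
    wavelength_m = wavelength_nm * 1e-9
    delta_t = d_si * length_m * wavelength_m**2 * mod_freq_hz / C_M_PER_S
    delay = d_si * length_m * wavelength_m
    amplitude = math.cos(2.0 * math.pi * mod_freq_hz * delta_t)
    reduction_db = -20.0 * math.log10(abs(amplitude))
    return delta_t, delay, reduction_db


delta_t, delay, reduction_db = dispersion_effects(2.0, 50.0, 1550.0, 10.0e9)
print(f"delta_t = {delta_t * 1e12:.2f} ps")
print(f"delay   = {delay * 1e9:.1f} ns")
print(f"reduction at 10 GHz = {reduction_db:.2f} dB")
```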


Appendix 7.3 Alias Sampling 


After Nyquist sampling of a signal band $n\Delta\nu$ to $(n + 1)\Delta\nu$, where $n$ is an integer, 
the frequency band of the sampled data is 0 to $\Delta\nu$ and does not depend on the 
frequency of the band at the sampler input.¹⁰ This effect is known as alias sampling. 
To illustrate this situation, consider a Fourier component $A\sin(2\pi\nu t + \phi)$, with 
arbitrary amplitude and phase, within a band 0 to $\Delta\nu$. The band is sampled at the 
Nyquist rate, the sample times being $t = m/(2\Delta\nu)$, where $m = 0, 1, 2, \ldots$. The 
sampled values of the component are $A\sin(\phi)$, $A\sin(\pi\nu/\Delta\nu + \phi)$, 
$A\sin(2\pi\nu/\Delta\nu + \phi)$, .... Now consider the case in which the same input band has 
been converted to the range $\Delta\nu$ to $2\Delta\nu$. The frequency is higher by $\Delta\nu$, so the 
original component becomes 

$$
A\sin[2\pi(\nu + \Delta\nu)t + \phi] = A\sin(2\pi\nu t + \phi)\cos(2\pi\Delta\nu t) + A\cos(2\pi\nu t + \phi)\sin(2\pi\Delta\nu t).
\tag{A7.4}
$$

Again, sampling at times $m/(2\Delta\nu)$, we obtain for the components: $A\sin(\phi)$, 
$-A\sin(\pi\nu/\Delta\nu + \phi)$, $A\sin(2\pi\nu/\Delta\nu + \phi)$, .... The result is the same as before 
except that the sign is reversed for odd values of $m$. Further investigation shows that 
this sign reversal occurs when $n$ has an odd value. Since the sign reversal occurs for 
both signals of a cross-correlated pair, it has no effect on the product. Thus, for any 
value of $n$, the result at the correlator output is the same as for a baseband input to 
the sampler. Thus, sampling of the band $n\Delta\nu$ to $(n + 1)\Delta\nu$ has the effect of converting 
the band downward by $n\Delta\nu$, sometimes referred to as alias sampling. 
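
The sign pattern described above can be checked numerically. The following Python sketch samples a component at frequency $\nu + n\Delta\nu$ at the times $m/(2\Delta\nu)$ and compares the result with the baseband samples multiplied by $(-1)^{nm}$. The function name and the chosen values of $\nu$, $\Delta\nu$, and $\phi$ are arbitrary illustrative assumptions.

```python
import numpy as np


def sampled_component(nu, bandwidth, n, phase, m):
    """Samples of sin(2 pi (nu + n * bandwidth) t + phase) at t = m / (2 * bandwidth)."""
    t = m / (2.0 * bandwidth)
    return np.sin(2.0 * np.pi * (nu + n * bandwidth) * t + phase)


# Assumed example values (hypothetical): frequencies in units of the bandwidth.
bandwidth, nu, phase = 1.0, 0.3, 0.7
m = np.arange(16)
baseband = sampled_component(nu, bandwidth, 0, phase, m)

for n in range(4):
    shifted = sampled_component(nu, bandwidth, n, phase, m)
    expected = (-1.0) ** (n * m) * baseband   # sign flips only for odd n and odd m
    print(n, np.allclose(shifted, expected))
```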


Open Access This chapter is licensed under the terms of the Creative Commons Attribution- 
NonCommercial 4.0 International License  (http://creativecommons.org/licenses/by-nc/4.0/), 
which permits any noncommercial use, sharing, adaptation, distribution and reproduction in 
any medium or format, as long as you give appropriate credit to the original author(s) and the 
source, provide a link to the Creative Commons license and indicate if changes were made. 


¹⁰ This is the case, for example, in both the VLA and ALMA, where a 1 : 2 frequency ratio between 
the lower and upper edges of the final analog IF response is used because it is easier to maintain 
uniform gain than with a baseband response. 
Chapter 8 
Digital Signal Processing 


The use of digital rather than analog instrumentation offers important practical 
advantages in the transmission of signals over long baselines, the implementation 
of compensating time delays, and the measurement of cross-correlation of signals. 
In digital delay circuits, the accuracy of the delay depends on the accuracy of the 
timing pulses in the system, and long delays accurate to tens of picoseconds are 
more easily achieved digitally than by using analog delay lines. Furthermore, there 
is no distortion of the signal by the digital units other than the calculable effects 
of quantization. In contrast, with an analog system, it is difficult to keep the shape 
of the frequency response within tolerances while delay elements are switched into 
and out of the signal channels. Correlators with wide dynamic range are readily 
implemented digitally, including those with multichannel output, as needed for 
spectral line observations. Analog multichannel correlators employ filter banks to 
divide the signal passband into many narrow channels. Such filters, when subject 
to temperature variations, can be a source of phase instability. Finally, except at the 
highest bit rates (frequencies), digital circuits need less adjustment than analog ones 
and are better suited to replication in large numbers for large arrays. 

Digitization of the signal waveforms requires sampling of the voltages at periodic 
intervals and quantizing the sampled values so that each can be represented by a 
finite number of bits. The number of bits per sample is usually not large, especially 
in cases in which the signal bandwidth is large, requiring high sampling rates. 
However, coarse quantization results in a loss in sensitivity, since modification 
of the signal levels to the quantized values effectively results in the addition of 
a component of “quantization noise.” In most cases, this loss is small and is 
outweighed by the other advantages. In designing digital correlators, there are 
compromises to be made between sensitivity and complexity, and the number of 
quantization levels to use is an important consideration. 

Fig. 8.1 The relationship between two random processes, $x(t)$ and $y(t)$, of 
duration $T$, and their cross-correlation function, $R_{xy}(\tau)$, and the cross 
spectrum, $S_{xy}(\nu)$. If $x(t)$ and $y(t)$ are the same, $R_{xx}(\tau)$ is the autocorrelation 
function, and $S_{xx}(\nu)$ is the power spectrum. The spatial counterpart to this diagram 
is shown in Fig. 5.5. 

There are two ways to determine the spectrum of a random noise signal, as 
shown in Fig. 8.1. The autocorrelation function of the signal can be measured and 
then Fourier transformed into a power spectrum after a specified integration period. 
Alternatively, the signal can be Fourier transformed first and the square modulus 
taken. In the first case, the resolution of the spectral estimate is approximately the 
reciprocal of the number of lags of the autocorrelation function calculated. In the 
direct Fourier transform route, the data stream must be segmented to control the 
spectral resolution, i.e., the resolution is approximately the reciprocal of the data 
segment length. The power spectra from all of the segments are summed over the 
integration period. To compare results between these methods, the number of lags in 
the correlator is set equal to the number of segment samples. For interferometry, the 
same two methods can be applied. The cross-correlation function can be calculated 
and Fourier transformed into a cross spectrum (called the XF technique), or the 
direct Fourier transform of one can be multiplied by the conjugate of the other to 
form the cross spectrum (the FX technique). These two methods are explored in 
detail in this chapter. 
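
The equivalence of the XF and FX routes can be illustrated on a single data segment. The Python sketch below is a minimal check, assuming circular (periodic) correlation over one segment with as many lags as samples. A practical XF correlator instead uses a limited number of linear lags and a long integration, so this shows only the underlying Fourier relationship. The signals and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64

# Two noisy signals sharing a common component (hypothetical test data).
common = rng.standard_normal(n)
x = common + rng.standard_normal(n)
y = np.roll(common, 3) + rng.standard_normal(n)

# FX: Fourier transform each signal, then multiply one by the conjugate of the other.
fx_spectrum = np.fft.fft(x) * np.conj(np.fft.fft(y))

# XF: circular cross-correlation r[k] = sum_j x[j + k] * y[j], then Fourier transform.
r = np.array([np.dot(np.roll(x, -k), y) for k in range(n)])
xf_spectrum = np.fft.fft(r)

print(np.allclose(fx_spectrum, xf_spectrum))
```

Setting `y = x` in the same sketch gives the single-signal case, in which the Fourier transform of the autocorrelation equals the squared modulus of the transform.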

Digital signal processing in radio astronomy began in the early 1960s when 
Weinreb (1963) built a digital 64-channel autocorrelator that operated on the signal 
sampled at the Nyquist rate and quantized with one bit per sample.¹ At that time, the 
modern fast Fourier transform (FFT) algorithm (Cooley and Tukey 1965) was not 
known, although there are historical precedents in the mathematical literature going 
back to Gauss in the early nineteenth century. For the next two decades, virtually all 
spectrometers for single-dish and interferometric applications were based on the 
auto- or cross-correlation approach. By the 1990s and the advent of very large 
spectral processing systems (in terms of frequency channels and baselines), the 
advantages of the FX approach became apparent. All modern interferometers have 
spectral analysis capabilities, not only for observations of spectral lines but also for 
mitigation of the effects of radio frequency interference (RFI) and of instrumental 
bandwidth smearing. 


¹ A similar device was used by Goldstein (1962) to detect radar echoes from Venus. 
The bivariate normal probability function is central to all signal analysis. If $x$ and $y$ 
are joint Gaussian random variables with zero mean and variance $\sigma^2$, the probability 
that one variable is between $x$ and $x + dx$ and, simultaneously, the other is between 
$y$ and $y + dy$ is $p(x, y)\,dx\,dy$, where 

$$
p(x, y) = \frac{1}{2\pi\sigma^2\sqrt{1 - \rho^2}}\exp\left[\frac{-(x^2 + y^2 - 2\rho xy)}{2\sigma^2(1 - \rho^2)}\right],
\tag{8.1}
$$

and $\rho$ is the correlation coefficient equal to $\langle xy\rangle/\sqrt{\langle x^2\rangle\langle y^2\rangle}$, where $\langle\ \rangle$ denotes the 
expectation, which, with the usual assumption of ergodicity, is approximated by the 
average over many samples. The form of this function is shown in Fig. 8.2. Note 
that $-1 \le \rho \le 1$. For $|\rho| \ll 1$, the exponential can be expanded, giving 

$$
p(x, y) \simeq \left[\frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-x^2}{2\sigma^2}\right)\right]\left[\frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-y^2}{2\sigma^2}\right)\right]\left(1 + \frac{\rho xy}{\sigma^2}\right),
\tag{8.2}
$$

which for $\rho = 0$ is simply the product of two Gaussian functions. Equation (8.1) 
can also be written as 

$$
p(x, y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(\frac{-x^2}{2\sigma^2}\right)\,\frac{1}{\sigma\sqrt{2\pi(1 - \rho^2)}}\exp\left[\frac{-(y - \rho x)^2}{2\sigma^2(1 - \rho^2)}\right].
\tag{8.3}
$$

Fig. 8.2 Contours of equal probability density from the bivariate Gaussian 
distribution in Eq. (8.1). The contours are given by $x^2 + y^2 - 2\rho xy = {\rm const}$. For 
$\rho = 0$, they become circles; for $\rho = 1$, they merge into the line $x = y$; and for 
$\rho = -1$, they merge into $x = -y$. 
If this expression is integrated with respect to $y$ from $-\infty$ to $+\infty$, it reduces to 
a Gaussian function in $x$. As $\rho$ approaches unity, Eq. (8.3) becomes the product 
of a Gaussian in $x$ and a Gaussian in $y - x$; the latter has a standard deviation 
$\sigma\sqrt{1 - \rho^2}$, which tends to zero as $\rho$ approaches 1. Equations (8.1) and (8.2) will 
be used in examining the response of various types of samplers and correlators. 
For autocorrelators used with single antennas, the quantity to be measured is the 
autocorrelation function $R(\tau) = \langle v(t)v(t - \tau)\rangle$, where $v$ is the received signal. This 
case can be treated with $x = v(t)$ and $y = v(t - \tau)$. 
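
The factored form in Eq. (8.3) suggests a direct way to generate correlated Gaussian pairs for numerical experiments: draw $x$ from a Gaussian of variance $\sigma^2$, then draw $y$ from a Gaussian centered on $\rho x$ with standard deviation $\sigma\sqrt{1 - \rho^2}$. The Python sketch below does this and estimates $\rho$ from the samples. The function name, seed, and parameter values are illustrative assumptions.

```python
import numpy as np


def draw_pairs(rho, sigma, n_samples, rng):
    """Draw (x, y) from the bivariate Gaussian of Eq. (8.1) using the form of Eq. (8.3)."""
    x = rng.normal(0.0, sigma, n_samples)
    y = rho * x + sigma * np.sqrt(1.0 - rho**2) * rng.standard_normal(n_samples)
    return x, y


rng = np.random.default_rng(1)
rho, sigma = 0.3, 2.0   # assumed example values
x, y = draw_pairs(rho, sigma, 200_000, rng)

# Sample estimate of <xy> / sqrt(<x^2><y^2>).
rho_hat = np.mean(x * y) / np.sqrt(np.mean(x**2) * np.mean(y**2))
print(f"estimated rho = {rho_hat:.3f}, variance of y = {np.var(y):.3f}")
```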


8.2 Periodic Sampling 


8.2.1 Nyquist Rate 


If the signal is bandlimited, that is, its power spectrum is nonzero only within a 
finite band of frequencies, no information is lost in the sampling process as long 
as the sampling rate is high enough. This follows from the sampling theorem 
discussed in Sect. 5.2.1. Here, we sample a function of time and must avoid aliasing 
in the frequency domain. For a baseband (lowpass) rectangular spectrum with an 
upper cutoff frequency $\Delta\nu$, the width of the frequency spectrum, including negative 
frequencies, is $2\Delta\nu$. The function is fully specified by samples spaced in time 
with an interval no greater than $1/(2\Delta\nu)$, that is, a sampling frequency of $2\Delta\nu$ 
or greater. This critical sampling frequency, $2\Delta\nu$, is called the Nyquist rate² for the 
waveform. For further discussion, see, for example, Bracewell (2000) or Oppenheim 
and Schafer (2009). In some digital systems in radio astronomy, the waveform 
that is digitized has a baseband spectrum and is sampled at the Nyquist rate. For 
a rectangular passband of this type, the autocorrelation function, which by the 
Wiener-Khinchin relation is the Fourier transform of the power spectrum, is 

$$
R_\infty(\tau) = \frac{\sin(2\pi\Delta\nu\tau)}{2\pi\Delta\nu\tau},
\tag{8.4}
$$

where the subscript $\infty$ indicates unquantized sampling (that is, the accuracy is not 
limited by a finite number of quantization levels). Nyquist sampling can also be 
applied to bandpass spectra, and if the spectrum is nonzero only within a range of 
$n\Delta\nu$ to $(n + 1)\Delta\nu$, where $n$ is an integer, the Nyquist rate is again $2\Delta\nu$. Thus, for 
sampling at the Nyquist rate, the lower and upper bounds of the spectral band must 
be integral multiples of the bandwidth. The autocorrelation function of a signal that 
has a flat spectrum over such a band is 

$$
R_\infty(\tau) = \frac{\sin(\pi\Delta\nu\tau)}{\pi\Delta\nu\tau}\cos\left[2\pi\left(n + \tfrac{1}{2}\right)\Delta\nu\tau\right].
\tag{8.5}
$$


² Shannon (1949) cites several references relevant to the development of this result, of which the 
earliest is Nyquist (1928). 
Zeros in this function occur at time intervals $\tau$ that are integral multiples of 
$1/(2\Delta\nu)$. Therefore, for a rectangular passband, successive samples at the Nyquist 
rate are uncorrelated. Sampling at frequencies greater or less than the Nyquist rate is 
referred to as oversampling or undersampling, respectively. For any signal, adjusting 
the center frequency so that the spectrum conforms to the bandpass sampling 
requirement described above minimizes the sampling rate required to avoid aliasing. 
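
The statement that samples at the Nyquist rate are uncorrelated for a rectangular passband can be verified by evaluating Eq. (8.5) at lags $q/(2\Delta\nu)$. The Python sketch below does this for several band positions $n$. The function name and the 1 GHz bandwidth are illustrative assumptions.

```python
import numpy as np


def bandpass_autocorrelation(tau_s, bandwidth_hz, n):
    """Eq. (8.5): autocorrelation for a flat band from n * bandwidth to (n + 1) * bandwidth.

    np.sinc(x) is sin(pi * x) / (pi * x).
    """
    envelope = np.sinc(bandwidth_hz * tau_s)
    carrier = np.cos(2.0 * np.pi * (n + 0.5) * bandwidth_hz * tau_s)
    return envelope * carrier


bandwidth_hz = 1.0e9                            # assumed example bandwidth
q = np.arange(1, 9)                             # lag in units of the sample interval
tau_nyquist = q / (2.0 * bandwidth_hz)

for n in range(4):
    values = bandpass_autocorrelation(tau_nyquist, bandwidth_hz, n)
    print(f"n = {n}: max |R| at Nyquist lags = {np.max(np.abs(values)):.1e}")
```

The printed maxima are at the level of floating-point rounding, that is, zero to numerical precision. For $n = 0$, the same function reduces to the baseband result of Eq. (8.4).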


8.2.2 Correlation of Sampled but Unquantized Waveforms 


We now investigate the response of a hypothetical correlator for which the input 
signals are sampled at the Nyquist rate but are not quantized. It is necessary 
to consider only single-multiplier correlators since complex correlators can be 
implemented as combinations of them, as indicated in Fig. 6.3. The system under 
discussion can be visualized as one in which the samples either remain as analog 
voltages or are encoded with a sufficiently large number of bits that quantization 
errors are negligible. Since no information is lost in sampling, the signal-to-noise 
ratio of the correlation measurement may be expected to be the same as would be 
obtained by applying the waveforms without sampling to an analog correlator. There 
is probably no reason, in practice, to build a correlator for inputs with unquantized 
sampling. However, by comparing the results with those for quantized sampling, 
which we discuss later, the effects of quantization are more easily understood. 

Two bandlimited waveforms, x(t) and y(t), are sampled at the Nyquist rate, 
and for each pair of samples, the multiplier within the correlator produces an 
output proportional to the product of the input amplitudes. The integrator allows 
the output to be averaged for any required time interval. Now the (normalized) 
cross-correlation coefficient of x(t) and y(t) for zero time delay between the two 
waveforms is 


$$
\rho = \frac{\langle x(t)y(t)\rangle}{\sqrt{\langle x^2(t)\rangle\langle y^2(t)\rangle}}.
\tag{8.6}
$$

(The cross-correlation coefficient $\rho$ should not be confused with the autocorrelation 
function of $x$ or $y$, $R_\infty$.) Since $x$ and $y$ have equal variance $\sigma^2$, 

$$
\langle x(t)y(t)\rangle = \rho\sigma^2.
\tag{8.7}
$$

The left side is the averaged product of the two waveforms and thus represents the 
correlator output. The output of the digital correlator after $N_N$ samples is 

$$
r_\infty = N_N^{-1}\sum_{i=1}^{N_N} x_i y_i,
\tag{8.8}
$$

where the subscript N denotes the Nyquist rate. Since the samples $x_i$ and $y_i$ obey the 
same Gaussian statistics as the continuous waveforms $x(t)$ and $y(t)$, we can clearly 
write 

$$
\langle r_\infty\rangle = \rho\sigma^2.
\tag{8.9}
$$


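Thus, the output of the correlator is a linear measure of the correlation $\rho$. The 
variance of the correlator output is 

$$
\sigma_\infty^2 = \langle r_\infty^2\rangle - \langle r_\infty\rangle^2,
\tag{8.10}
$$

and 

$$
\begin{aligned}
\langle r_\infty^2\rangle &= N_N^{-2}\sum_{i=1}^{N_N}\sum_{k=1}^{N_N}\langle x_i y_i x_k y_k\rangle \\
 &= N_N^{-2}\sum_{i=1}^{N_N}\langle x_i^2 y_i^2\rangle + N_N^{-2}\sum_{i=1}^{N_N}\sum_{k \neq i}\langle x_i y_i x_k y_k\rangle,
\end{aligned}
\tag{8.11}
$$

where we have separated the terms for which $i = k$ and $i \neq k$. The first summation 
on the right side of Eq. (8.11) has a value of $\sigma^4(1 + 2\rho^2)N_N^{-1}$: from Eq. (8.3), it can 
be shown that 

$$
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^2 y^2\, p(x, y)\,dx\,dy = \sigma^4(1 + 2\rho^2).
\tag{8.12}
$$

The second summation term in Eq. (8.11) is readily evaluated by using the fourth- 
order moment relation in Eq. (6.36). Because successive samples of each signal are 
uncorrelated (a rectangular passband is assumed), $\langle x_i y_i x_k y_k\rangle = \langle x_i y_i\rangle\langle x_k y_k\rangle$, and the 
second summation term has a value of $(1 - N_N^{-1})\rho^2\sigma^4$. Returning to Eq. (8.10), we 
can write 

$$
\begin{aligned}
\sigma_\infty^2 &= (1 + 2\rho^2)\sigma^4 N_N^{-1} + (1 - N_N^{-1})\rho^2\sigma^4 - \rho^2\sigma^4 \\
 &= \sigma^4 N_N^{-1}(1 + \rho^2).
\end{aligned}
\tag{8.13}
$$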


The signal-to-noise ratio with unquantized sampling is 

$$
\mathcal{R}_{{\rm sn}\infty} = \frac{\langle r_\infty\rangle}{\sigma_\infty} = \frac{\rho\sqrt{N_N}}{\sqrt{1 + \rho^2}} \simeq \rho\sqrt{N_N},
\tag{8.14}
$$

where the approximation applies for $\rho \ll 1$. Note that the condition $\rho \ll 1$ 
is satisfied in many practical circumstances. For the case in which $\rho \gtrsim 0.2$, 
see Sect. 8.3.6. (The signal-to-noise ratio at the correlator output, which we are 
calculating here, is of interest mainly for weak signals.) For a measurement period 
$\tau$, $N_N = 2\Delta\nu\tau$, which is commonly $10^6$–$10^{12}$. From Eq. (8.14), the threshold of 
detectability of a signal is given by $\rho\sqrt{N_N} \simeq 1$, that is, $\rho \simeq 10^{-3}$–$10^{-6}$. In terms 
of the signal bandwidth and measurement duration, $\mathcal{R}_{{\rm sn}\infty} = \rho\sqrt{2\Delta\nu\tau}$. Now for 
observations of a point source with identical antennas and receivers, $\rho$ is equal to 
the ratio of the resulting antenna temperature to the system temperature, $T_A/T_S$. 
Thus, the present result is equal to that given by Eq. (6.45) for an analog correlator 
with continuous unsampled inputs and $T_A \ll T_S$. 
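
The variance in Eq. (8.13) and the signal-to-noise ratio in Eq. (8.14) can be checked with a small Monte Carlo experiment. The Python sketch below simulates many independent correlator outputs, each the average of $N_N$ products of correlated Gaussian samples, and compares the measured statistics with the predictions. The seed, the sample counts, and the relatively large $\rho = 0.3$ (chosen so that the $1 + \rho^2$ factor is visible) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
rho, sigma = 0.3, 1.0             # assumed example values
n_samples, n_trials = 500, 4000   # N_N per correlator output, number of outputs

# Correlated Gaussian pairs, uncorrelated from sample to sample (Nyquist sampling).
x = rng.normal(0.0, sigma, (n_trials, n_samples))
y = rho * x + sigma * np.sqrt(1.0 - rho**2) * rng.standard_normal((n_trials, n_samples))

r = np.mean(x * y, axis=1)        # one correlator output per trial, Eq. (8.8)

predicted_var = sigma**4 * (1.0 + rho**2) / n_samples            # Eq. (8.13)
predicted_snr = rho * np.sqrt(n_samples) / np.sqrt(1.0 + rho**2)  # Eq. (8.14)

print(f"mean output: {r.mean():.4f} (expected {rho * sigma**2:.4f})")
print(f"variance:    {r.var():.3e} (expected {predicted_var:.3e})")
print(f"SNR:         {r.mean() / r.std():.2f} (expected {predicted_snr:.2f})")
```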

Before leaving the subject of unquantized sampling, we should consider the 
effect of sampling at rates other than the Nyquist rate. Successive sample values 
from any one signal are then no longer independent. We consider a sampling 
frequency that is $\beta$ times the Nyquist rate³ and a number of samples $N = \beta N_N$. The 
sample interval is $\tau_s = (2\beta\Delta\nu)^{-1}$. Samples spaced by $q\tau_s$, where $q$ is an integer, 
have a correlation coefficient that, from Eq. (8.4), is equal to 

$$
R_\infty(q\tau_s) = \frac{\sin(\pi q/\beta)}{\pi q/\beta}
\tag{8.15}
$$

for a rectangular baseband response. Since the samples are not independent, we 
must reconsider the evaluation of the second summation term on the right side of 
Eq. (8.11). For those terms for which $q = |i - k|$ is small enough that $R_\infty(q\tau_s)$ is 
significant, there will be an additional contribution given by 

$$
[\sigma^2 R_\infty(q\tau_s)]^2.
\tag{8.16}
$$

Now $R_\infty^2$ is very small for all but a very small fraction of the $N(N - 1)$ terms 
in the second summation in Eq. (8.11). From Eq. (8.15), $R_\infty^2$, at its maxima, 
is equal to $(\beta/\pi q)^2$ and for $q = 10^3$ is of order $10^{-6}$. However, as shown 
above, $N$ is likely to be as high as $10^6$–$10^{12}$. Thus, in the second summation in 
Eq. (8.11), the contribution made by the terms for which the $i$ and $k$ samples are 
effectively independent remains essentially unchanged. The products for which $R_\infty^2$ 
is significant make an additional contribution equal to 

$$
2\sigma^4 N^{-2}\sum_{q=1}^{N-1}(N - q)R_\infty^2(q\tau_s) \simeq 2\sigma^4 N^{-1}\sum_{q=1}^{\infty}R_\infty^2(q\tau_s).
\tag{8.17}
$$

The variance of the correlator output now becomes 

$$
\sigma_\infty^2 = \sigma^4 N^{-1}\left[1 + 2\sum_{q=1}^{\infty}R_\infty^2(q\tau_s)\right],
\tag{8.18}
$$

³ $\beta$ is referred to as the oversampling factor. 
and the signal-to-noise ratio of the correlation measurement is (see Appendix 8.1) 

$$
\mathcal{R}_{{\rm sn}\infty} = \frac{\rho\sqrt{\beta N_N}}{\sqrt{1 + 2\sum_{q=1}^{\infty}R_\infty^2(q\tau_s)}}.
\tag{8.19}
$$

Compare this result with Eq. (8.14) for Nyquist sampling. For values of $\beta$ of 1/2, 1/3, 
1/4, and so on, which correspond to undersampling, $R_\infty = 0$ and the denominator in 
Eq. (8.19) is unity. The sensitivity thus drops as one would expect from the decrease 
in the data. For oversampling, $\beta > 1$, and the summation of $R_\infty^2(q\tau_s)$ in Eq. (8.19) 
is shown in Appendix 8.1 to be equal to $(\beta - 1)/2$. The denominator in Eq. (8.19) is 
then equal to $\sqrt{\beta}$, so the sensitivity is the same as that for sampling at the Nyquist 
rate. This is as expected, since in Nyquist sampling, no information is lost, and 
thus there is none to be gained by increased sampling. The result is different for 
quantized sampling, as will appear in the following sections. 
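
The value $(\beta - 1)/2$ quoted for the summation can be confirmed numerically by truncating the infinite sum at a large number of terms. The Python sketch below does this and evaluates the signal-to-noise ratio of Eq. (8.19) relative to the Nyquist-rate value of Eq. (8.14). The function names and the truncation at $10^6$ terms are illustrative assumptions; the truncation leaves a small residual error of order $\beta^2/(2\pi^2 \times 10^6)$.

```python
import numpy as np


def sum_r_squared(beta, q_max=1_000_000):
    """Truncated sum over q >= 1 of R_inf^2(q tau_s), with R_inf from Eq. (8.15)."""
    q = np.arange(1, q_max + 1)
    return np.sum(np.sinc(q / beta) ** 2)   # np.sinc(x) = sin(pi x) / (pi x)


def snr_relative_to_nyquist(beta):
    """Ratio of Eq. (8.19) to Eq. (8.14) for the same observing time."""
    return np.sqrt(beta) / np.sqrt(1.0 + 2.0 * sum_r_squared(beta))


for beta in (0.5, 1.0, 1.5, 2.0, 4.0):
    print(f"beta = {beta:3.1f}: sum = {sum_r_squared(beta):.5f}, "
          f"relative SNR = {snr_relative_to_nyquist(beta):.4f}")
```

For $\beta \geq 1$ the relative signal-to-noise ratio stays at unity, while for $\beta = 1/2$ it falls to $\sqrt{1/2}$, in line with the discussion above.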


8.3 Sampling with Quantization 


In some sampling schemes, the signal is first quantized and then sampled, and in 
others, it is sampled and then quantized. Ideally, the end result is the same in either 
case, and in analyzing the process, we can choose the order that is most convenient. 
Suppose that a bandlimited signal is first quantized and then sampled. Quantization 
generates new frequency components in the signal waveform, so it is no longer 
bandlimited. If it is sampled at the Nyquist rate corresponding to the unquantized 
waveform, as is the usual practice, some information will be lost, and the sensitivity 
will be less than for unquantized sampling. Also, because quantization is a nonlinear 
operation, we cannot assume that the measured correlation of the quantized
waveforms will be a linear function of $\rho$, which is what we want to measure.
Thus, to utilize digital signal processing, there are three main points that should be
investigated: (1) the relation between $\rho$ and the measured correlation, (2) the loss in
sensitivity, and (3) the extent to which oversampling can restore the lost sensitivity.
Investigations of these points can be found in the work of Weinreb (1963), Cole 
(1968), Burns and Yao (1969), Cooper (1970), Hagen and Farley (1973), Bowers 
and Klingler (1974), Jenet and Anderson (1998), and Gwinn (2004). 

Note that in discussing sampling with quantization, it is common practice to 
refer to Nyquist sampling when what is meant is sampling at the Nyquist rate for 
the unquantized waveform. We also follow this usage. 


8.3.1 Two-Level Quantization 


Sampling with two-level (one-bit) quantization provided the earliest digital form of radio astronomy signals (Weinreb 1963). Although larger numbers of levels are now routinely used, this subsection is included as an introduction to the subject. The quantization characteristic for two-level sampling is shown in Fig. 8.3. The quantizing action senses only the sign of the instantaneous signal voltage. In many samplers, the signal voltage is first amplified and strongly clipped. The zero crossings are more sharply defined in the resulting waveform, and errors that might occur if the sampling time coincides with a sign reversal are thereby minimized.

Fig. 8.3 Characteristic curve for two-level quantization. The abscissa is the input voltage $x$ and the ordinate is the quantized output $\hat{x}$.

The correlator for two-level signals consists of a multiplying circuit followed by a counter that sums the products of the input samples. The input signals are assigned values of $+1$ or $-1$ to indicate positive or negative signal voltages, and the products at the multiplier output thus take values of $+1$ or $-1$ for identical or different input values, respectively. We consider sampling both at the Nyquist rate and at multiples of it and represent by $N$ the number of sample pairs fed to the correlator. The two-level correlation coefficient is

$$ \rho_2 = \frac{(N_{11} + N_{\bar1\bar1}) - (N_{1\bar1} + N_{\bar11})}{N} , \tag{8.20} $$

where $N_{11}$ is the number of products for which both samples have the value $+1$, $N_{1\bar1}$ is the number of products in which the $x$ sample has the value $+1$ and the $y$ sample $-1$, and so on. The denominator in Eq. (8.20) is equal to the output that would occur if, for each sample pair, the signs of the signals were identical. $\rho_2$ can be related to the correlation coefficient $\rho$ of the unquantized signals through the bivariate probability distribution Eq. (8.1), from which

$$ P_{11} = \frac{N_{11}}{N} = \frac{1}{2\pi\sigma^2\sqrt{1-\rho^2}} \int_0^\infty\!\!\int_0^\infty \exp\left[\frac{-(x^2 + y^2 - 2\rho xy)}{2\sigma^2(1-\rho^2)}\right] dx\,dy , \tag{8.21} $$

where $P_{11}$ is the probability of the two unquantized signals being simultaneously greater than zero. The other required probabilities are obtained by changing the limits of the integrals in Eq. (8.21) as follows: both integrals run from $-\infty$ to 0 for $P_{\bar1\bar1}$; the $x$ integral runs from 0 to $\infty$ and the $y$ integral from $-\infty$ to 0 for $P_{1\bar1}$; and the $x$ integral runs from $-\infty$ to 0 and the $y$ integral from 0 to $\infty$ for $P_{\bar11}$. Note that $P_{11} = P_{\bar1\bar1}$ and $P_{1\bar1} = P_{\bar11}$. Thus,

$$ \rho_2 = 2(P_{11} - P_{1\bar1}) . \tag{8.22} $$


The integral in Eq. (8.21) is evaluated in Appendix 8.2, from which we obtain

$$ P_{11} = \frac14 + \frac{1}{2\pi}\sin^{-1}\rho . \tag{8.23} $$

Similarly,

$$ P_{1\bar1} = \frac14 - \frac{1}{2\pi}\sin^{-1}\rho , \tag{8.24} $$

so

$$ \rho_2 = \frac{2}{\pi}\sin^{-1}\rho . \tag{8.25} $$

Equation (8.25), known as the Van Vleck relationship,⁴ allows $\rho$ to be obtained from the measured correlation $\rho_2$. For small values, $\rho$ is proportional to $\rho_2$.
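A small Monte Carlo experiment illustrates the Van Vleck relationship. The sketch below draws pairs of Gaussian samples with correlation $\rho$, keeps only their signs, forms the two-level correlation of Eq. (8.20), and inverts Eq. (8.25) as $\rho = \sin(\pi\rho_2/2)$. The function names, the sample count, and the fixed random seed are assumptions made for this illustration.

```python
import math
import random


def van_vleck(rho2):
    """Invert Eq. (8.25): rho = sin(pi * rho2 / 2)."""
    return math.sin(0.5 * math.pi * rho2)


def one_bit_correlation(rho, n_samples, seed=1):
    """Estimate rho_2 of Eq. (8.20) from simulated sign-only samples."""
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    total = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + c * rng.gauss(0.0, 1.0)  # y has correlation rho with x
        # Product of the two signs: +1 if identical, -1 if different.
        total += 1 if (x >= 0.0) == (y >= 0.0) else -1
    return total / n_samples


if __name__ == "__main__":
    rho_true = 0.5
    rho2 = one_bit_correlation(rho_true, 200_000)
    print("measured rho_2:", rho2)
    print("expected rho_2:", (2.0 / math.pi) * math.asin(rho_true))
    print("corrected rho :", van_vleck(rho2))
```

For $\rho = 0.5$, Eq. (8.25) predicts $\rho_2 = 1/3$, and the corrected value returned by `van_vleck` should scatter about 0.5 with an uncertainty set by the number of samples.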

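To determine the signal-to-noise ratio of the correlation measurement, we now calculate $\sigma_2^2$, the variance of the correlator output $r_2$:

$$ \sigma_2^2 = \langle r_2^2\rangle - \langle r_2\rangle^2 , \tag{8.26} $$

where

$$ r_2 = N^{-1}\sum_{i=1}^{N} \hat{x}_i\hat{y}_i . \tag{8.27} $$

In this chapter, the circumflex (^) is used to denote quantized signal waveforms. Since $\rho_2 = \langle\hat{x}\hat{y}\rangle$, then from Eq. (8.27), $\langle r_2\rangle = \rho_2$. Thus, $r_2$ is an unbiased estimator of $\rho_2$. The expression for $\langle r_2^2\rangle$ is equivalent to Eq. (8.11) for unquantized waveforms:

$$ \langle r_2^2\rangle = N^{-2}\sum_{i=1}^{N}\langle\hat{x}_i^2\hat{y}_i^2\rangle + N^{-2}\sum_{i=1}^{N}\sum_{k\neq i}\langle\hat{x}_i\hat{y}_i\hat{x}_k\hat{y}_k\rangle . \tag{8.28} $$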


⁴ This result was first derived by J. H. Van Vleck during World War II in a classified report, when studying the power spectrum of strongly clipped noise, which was used for electromagnetic jamming (Van Vleck 1943). The work was later declassified, and a brief summary of it appeared in Vol. 24 of MIT’s Radiation Laboratory Series (Lawson and Uhlenbeck 1950). A fuller account was given by Van Vleck and Middleton (1966).

The first summation term on the right side of Eq. (8.28) is equal to $N^{-1}$ since the products $\hat{x}_i\hat{y}_i$ take values of $\pm1$ for two-level sampling. In evaluating the second summation term, the situation is similar to that for unquantized sampling. The factor $\sigma^4$ in Eq. (8.17) is here replaced by the square of the variance of the quantized waveform, which is unity for two-level quantization. For all except a small fraction of the terms, $q = |i-k|$ is large enough that samples $i$ and $k$ from the same waveform are uncorrelated. These terms make a total contribution closely equal to $\rho_2^2$. Those terms for which samples $i$ and $k$ are correlated make an additional contribution closely equal to

$$ 2N^{-1}\sum_{q=1}^{\infty} R_2^2(q\tau_s) , \tag{8.29} $$

where $R_2(\tau)$ is the autocorrelation coefficient for a signal after two-level quantization. Thus,

$$ \sigma_2^2 = N^{-1} + (1 - N^{-1})\rho_2^2 + 2N^{-1}\sum_{q=1}^{\infty} R_2^2(q\tau_s) - \rho_2^2 \tag{8.30a} $$

$$ \simeq N^{-1}\left[1 + 2\sum_{q=1}^{\infty} R_2^2(q\tau_s)\right] , \tag{8.30b} $$

where we have assumed that $\rho \ll 1$ and also that the term $-N^{-1}\rho_2^2$ can be neglected, since here we are mostly interested in signals near the threshold of detectability. Then the signal-to-noise ratio is

$$ R_{\rm sn2} = \frac{\langle r_2\rangle}{\sigma_2} = \frac{2\rho\sqrt{N}}{\pi\sqrt{1 + 2\sum_{q=1}^{\infty} R_2^2(q\tau_s)}} . \tag{8.31} $$

This ratio, relative to that for unquantized sampling at the Nyquist rate given by Eq. (8.14), defines an efficiency factor for the quantized correlation process:

$$ \eta_2 = \frac{R_{\rm sn2}}{R_{{\rm sn}\infty}} = \frac{2\sqrt{\beta}}{\pi\sqrt{1 + 2\sum_{q=1}^{\infty} R_2^2(q\tau_s)}} . \tag{8.32} $$


Here, we have used $N = \beta N_N$, so we are considering the same observing time as in the Nyquist-sampled case but sampling $\beta$ times as rapidly. Note that $\tau_s$ is correspondingly reduced. $\eta_2$ is one case of the general quantization efficiency factor, $\eta_Q$ (introduced in Sect. 6.2), where $Q$ is the number of quantization levels. Equation (8.25) gives the relationship between the correlation coefficients for a pair of signals before and after two-level quantization. This result includes the case of autocorrelation in which the two signals differ only because of a delay. Thus, we may write

$$ R_2(q\tau_s) = \frac{2}{\pi}\sin^{-1}\left[R_\infty(q\tau_s)\right] . \tag{8.33} $$

Equation (8.15) gives $R_\infty(q\tau_s)$ for a rectangular baseband signal spectrum sampled at $\beta$ times the Nyquist rate, and Eq. (8.33) becomes

$$ R_2(q\tau_s) = \frac{2}{\pi}\sin^{-1}\left[\frac{\beta\sin(\pi q/\beta)}{\pi q}\right] . \tag{8.34} $$

$R_2(q\tau_s)$ thus has zeros at the same values of $q\tau_s$ that $R_\infty(q\tau_s)$ does (the principal value is taken for the inverse sine function), and for $\beta = 1$, $\tfrac12$, $\tfrac13$, and so on, we obtain

$$ \sum_{q=1}^{\infty} R_2^2(q\tau_s) = 0 . \tag{8.35} $$

In these cases, the signal-to-noise ratio is a factor of $2/\pi$ (= 0.637) times that for unquantized sampling at the same rate given in Eq. (8.19). For oversampling with $\beta = 2$ and $\beta = 3$, the corresponding signal-to-noise factors from Eqs. (8.32) and (8.34) are 0.744 and 0.773, respectively. Note, however, that the increased bit rate used in oversampling could produce a bigger increase in the signal-to-noise ratio if used to increase the number of quantization levels. Doubling the bit rate could be used to increase the number of levels to four, for which the signal-to-noise factor is 0.881 (as derived in Sect. 8.3.2). For a threefold increase in bit rate, the number of levels could be increased to eight, for which the signal-to-noise factor is 0.963. Note also that in the calculations given above, there is an implicit dependence on the bandpass shape of the signal through the assumption that $\rho_2 \ll 1$ for samples for which $i$ is not equal to $k$ in Eq. (8.28). For $\beta > 2$, a further dependence on the bandpass shape enters through the autocorrelation function $R_2(q\tau_s)$.
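The efficiency factors quoted above for two-level sampling can be reproduced directly from Eqs. (8.32) and (8.34). The following sketch assumes a rectangular baseband spectrum, as in Eq. (8.15), and truncates the infinite sum at `n_terms`; the function names and the truncation length are illustrative choices.

```python
import math


def r_inf(q, beta):
    """Eq. (8.15): unquantized autocorrelation at lag q * tau_s."""
    x = math.pi * q / beta
    return math.sin(x) / x


def r2(q, beta):
    """Eq. (8.34): autocorrelation after two-level quantization."""
    return (2.0 / math.pi) * math.asin(r_inf(q, beta))


def eta2(beta, n_terms=200_000):
    """Eq. (8.32): two-level efficiency relative to unquantized Nyquist sampling."""
    s = sum(r2(q, beta) ** 2 for q in range(1, n_terms + 1))
    return (2.0 / math.pi) * math.sqrt(beta) / math.sqrt(1.0 + 2.0 * s)


if __name__ == "__main__":
    for beta in (1.0, 2.0, 3.0):
        print(f"beta = {beta:.0f}: eta_2 = {eta2(beta):.3f}")
```

For $\beta = 1$ the sum vanishes to within rounding error and the result is $2/\pi$; for $\beta = 2$ and 3 the values should agree with the 0.744 and 0.773 given in the text.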

It has been mentioned that quantization generates additional spectral compo- 
nents. We can compare the power spectra of a signal before and after quantization 
since these spectra are the Fourier transforms of autocorrelation functions that are 
related by Eq. (8.25). Figure 8.4 shows the spectrum, after two-level quantization, of 
noise with an originally rectangular spectrum. A fraction of the original bandlimited 
spectrum is converted into a broad, low-level skirt that dies away very slowly with 
frequency. 


8.3.2 Four-Level Quantization 


Fig. 8.4 Spectra of rectangular bandpass noise before and after two-level quantization, plotted as power spectral density against frequency. The unquantized spectrum is of lowpass form, as shown by the broken line. The spectrum after quantization is shown by the solid curve. The power levels of the two waveforms (represented by the areas under the curves) are equal, and the Fourier transforms of their spectra are related by Eq. (8.25).

The use of two digital bits to represent the amplitude of each sample results in less degradation of the signal-to-noise ratio than is obtained with one-bit quantization. Consideration of two-bit sampling leads naturally to four-level quantization, the performance of which has been investigated by several authors, notably Cooper (1970) and Hagen and Farley (1973). The quantization characteristic is shown in Fig. 8.5, where the quantization thresholds are $-v_0$, 0, and $v_0$. The four quantization states have designated values $-n$, $-1$, $+1$, and $+n$, where $n$, which is not necessarily an integer, can be chosen to optimize the performance. Products of two samples can take the value $\pm1$, $\pm n$, or $\pm n^2$. The four-level correlation coefficient $\rho_4$ can be specified by an expression similar to Eq. (8.20) for the two-level case, that is,

$$ \rho_4 = \frac{2n^2 N_{nn} - 2n^2 N_{n\bar{n}} + 4n N_{1n} - 4n N_{1\bar{n}} + 2N_{11} - 2N_{1\bar1}}{\left(2n^2 N_{nn} + 2N_{11}\right)_{\rho=1}} , \tag{8.36} $$

where a bar on the subscript indicates a negative sign. The numerator is proportional to the correlator output and reduces to the form in the denominator for $\rho = 1$, that is, when the two input waveforms are identical. The numbers of the various level combinations can be derived from the corresponding joint probabilities. Thus, for example,

$$ N_{nn} = N P_{nn} = \frac{N}{2\pi\sigma^2\sqrt{1-\rho^2}} \int_{v_0}^{\infty}\!\!\int_{v_0}^{\infty} \exp\left[\frac{-(x^2 + y^2 - 2\rho xy)}{2\sigma^2(1-\rho^2)}\right] dx\,dy , \tag{8.37} $$

and, as in the two-level case, the other probabilities are obtained by using the appropriate limits for the integrals. For the case of $\rho \ll 1$, the approximate form of the probability distribution in Eq. (8.2) simplifies the calculation.

Fig. 8.5 Characteristic curve for four-level quantization, with weighting factor $n$ for outer levels. The abscissa is the unquantized voltage $x$, and the ordinate is the quantized output $\hat{x}$. $v_0$ is the threshold voltage.


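Although $\rho_4$ can be evaluated from Eq. (8.36) in the above manner, an alternative derivation that provides a more rapid approach to the desired result is used here. This approach follows the treatment of Hagen and Farley (1973) and is based on a theorem by Price (1958). The form of the theorem that we require is

$$ \frac{d\langle r_4\rangle}{d\rho} = \sigma^2\left\langle\frac{\partial\hat{x}}{\partial x}\,\frac{\partial\hat{y}}{\partial y}\right\rangle , \tag{8.38} $$

where $r_4$ is the unnormalized correlator output, and $\hat{x}$ and $\hat{y}$ are again the quantized versions of the input signals. For four-level sampling,

$$ \frac{\partial\hat{x}}{\partial x} = (n-1)\,\delta(x + v_0) + 2\,\delta(x) + (n-1)\,\delta(x - v_0) , \tag{8.39} $$

where $\delta$ is the delta function, and a similar expression can be written for $\partial\hat{y}/\partial y$. Equation (8.39) is the derivative of the function in Fig. 8.5. To determine the expectation of the product of the two derivatives on the right side of Eq. (8.38), the magnitudes of each of the nine terms in the product of the derivatives must be multiplied by the probability of occurrence. Thus, for example, the term $(n-1)^2\delta(x + v_0)\delta(y + v_0)$ has a magnitude of $(n-1)^2$ and probability

$$ \frac{1}{2\pi\sigma^2\sqrt{1-\rho^2}}\exp\left[\frac{-v_0^2}{\sigma^2(1+\rho)}\right] . \tag{8.40} $$

By consolidating terms with equal probabilities, we obtain

$$ \frac{d\langle r_4\rangle}{d\rho} = \frac{1}{\pi\sqrt{1-\rho^2}}\left\{(n-1)^2\left[\exp\left(\frac{-v_0^2}{\sigma^2(1+\rho)}\right) + \exp\left(\frac{-v_0^2}{\sigma^2(1-\rho)}\right)\right] + 4(n-1)\exp\left(\frac{-v_0^2}{2\sigma^2(1-\rho^2)}\right) + 2\right\} \tag{8.41} $$

and

$$ \langle r_4\rangle = \frac{1}{\pi}\int_0^{\rho}\frac{1}{\sqrt{1-\xi^2}}\left\{(n-1)^2\left[\exp\left(\frac{-v_0^2}{\sigma^2(1+\xi)}\right) + \exp\left(\frac{-v_0^2}{\sigma^2(1-\xi)}\right)\right] + 4(n-1)\exp\left(\frac{-v_0^2}{2\sigma^2(1-\xi^2)}\right) + 2\right\} d\xi , \tag{8.42} $$

where $\xi$ is a dummy variable of integration. To obtain the correlation coefficient $\rho_4$, $\langle r_4\rangle$ must be divided by the expectation of the correlator output when the inputs are identical four-level waveforms, as in Eq. (8.36):

$$ \rho_4 = \frac{\langle r_4\rangle}{\Phi + n^2(1-\Phi)} , \tag{8.43} $$

where $\Phi$ is the probability that the unquantized level lies between $\pm v_0$, that is,

$$ \Phi = \frac{1}{\sigma\sqrt{2\pi}}\int_{-v_0}^{v_0}\exp\left(\frac{-x^2}{2\sigma^2}\right) dx = {\rm erf}\left(\frac{v_0}{\sigma\sqrt{2}}\right) . \tag{8.44} $$

Equations (8.42)–(8.44) provide a relationship between $\rho_4$ and $\rho$ that is equivalent to the Van Vleck relationship for two-level quantization.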

The choice of values for $n$ and $v_0$ is usually made to maximize the signal-to-noise ratio for weak signals, which we now derive. For $\rho \ll 1$, Eqs. (8.42) and (8.43) reduce to

$$ (\rho_4)_{\rho\ll1} = \rho\,\frac{2[(n-1)E + 1]^2}{\pi[\Phi + n^2(1-\Phi)]} , \tag{8.45} $$

where $E = \exp(-v_0^2/2\sigma^2)$. The variance in the measurement of $r_4$ is

$$ \sigma_4^2 = \langle r_4^2\rangle - \langle r_4\rangle^2 = \langle r_4^2\rangle - \rho_4^2\left[\Phi + n^2(1-\Phi)\right]^2 . \tag{8.46} $$

The factor $[\Phi + n^2(1-\Phi)]$ is the variance of the quantized waveform and here takes the place of $\sigma^2$ in the corresponding equations for unquantized sampling. Again, we follow the procedure explained for the unquantized case and write

$$ \langle r_4^2\rangle = N^{-2}\sum_{i=1}^{N}\langle\hat{x}_i^2\hat{y}_i^2\rangle + N^{-2}\sum_{i=1}^{N}\sum_{k\neq i}\langle\hat{x}_i\hat{y}_i\hat{x}_k\hat{y}_k\rangle . \tag{8.47} $$

To evaluate the first summation, note that $\hat{x}_i^2\hat{y}_i^2$ can take values of 1, $n^2$, or $n^4$, and the sum of these values multiplied by their probabilities is equal to $[\Phi + n^2(1-\Phi)]^2$. The contribution of the second summation is

$$ (1 - N^{-1})\rho_4^2\left[\Phi + n^2(1-\Phi)\right]^2 + 2N^{-1}\left[\Phi + n^2(1-\Phi)\right]^2\sum_{q=1}^{\infty} R_4^2(q\tau_s) , \tag{8.48} $$

where the second term represents the effect of oversampling and is similar to Eq. (8.17), and $R_4$ is the autocorrelation function after four-level quantization. Thus, from Eq. (8.46), we have

$$ \sigma_4^2 = N^{-1}\left[\Phi + n^2(1-\Phi)\right]^2\left[1 + 2\sum_{q=1}^{\infty} R_4^2(q\tau_s) - \rho_4^2\right] . \tag{8.49} $$

Since we have assumed $\rho \ll 1$, the $\rho_4^2$ term can be neglected, and the signal-to-noise ratio for the four-level correlation measurement is

$$ R_{\rm sn4} = \frac{\langle r_4\rangle}{\sigma_4} = \frac{2\rho[(n-1)E + 1]^2\sqrt{N}}{\pi[\Phi + n^2(1-\Phi)]\sqrt{1 + 2\sum_{q=1}^{\infty} R_4^2(q\tau_s)}} . \tag{8.50} $$

The signal-to-noise ratio relative to that for unquantized Nyquist sampling is obtained from Eq. (8.14) for $N = \beta N_N$ and is

$$ \eta_4 = \frac{R_{\rm sn4}}{R_{{\rm sn}\infty}} = \frac{2[(n-1)E + 1]^2\sqrt{\beta}}{\pi[\Phi + n^2(1-\Phi)]\sqrt{1 + 2\sum_{q=1}^{\infty} R_4^2(q\tau_s)}} . \tag{8.51} $$

For sampling at the Nyquist rate, $\beta = 1$ and

$$ \eta_4 = \frac{R_{\rm sn4}}{R_{{\rm sn}\infty}} = \frac{2[(n-1)E + 1]^2}{\pi[\Phi + n^2(1-\Phi)]} . \tag{8.52} $$


Values of $\eta_4$ very close to optimum sensitivity are obtained for $n = 3$ with $v_0 = 0.996\sigma$, and for $n = 4$ with $v_0 = 0.942\sigma$; see Table A8.1 in Appendix 8.3. Note that the choice of an integer for the value of $n$ simplifies the correlator. For these two cases, $\eta_4$, the signal-to-noise ratio relative to that for unquantized sampling, is equal to 0.881 and 0.880, respectively. Curves of the relative sensitivity as a function of $v_0/\sigma$ for $n = 2$, 3, and 4 are shown in Fig. 8.6. Similar conclusions are derived by Hagen and Farley (1973) and Bowers and Klingler (1974).

Fig. 8.6 Signal-to-noise ratio relative to that for unquantized correlation for the four-level system and several modifications of it. The abscissa is the quantization threshold $v_0$ in units of the rms level of the waveforms at the quantizer input. The ordinate is sensitivity (signal-to-noise ratio) relative to an unquantized system. The curves are for: (1) full four-level system with $n = 2$; (2) full four-level system with $n = 3$; (3) full four-level system with $n = 4$; (4) four-level system with $n = 3$ and low-level products omitted; (5) three-level system. From Cooper (1970). © CSIRO 1970. Published by CSIRO Publishing, Melbourne, Victoria, Australia. Reproduced with permission.
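The Nyquist-rate efficiency of Eq. (8.52) is easy to evaluate for any choice of $n$ and $v_0$. The sketch below does so using $E = \exp(-v_0^2/2\sigma^2)$ and $\Phi$ from Eq. (8.44); the function name and the convention of passing the threshold in units of $\sigma$ are assumptions made for this example.

```python
import math


def eta4_nyquist(n, v0_over_sigma):
    """Eq. (8.52): four-level efficiency at the Nyquist rate (beta = 1)."""
    e = math.exp(-0.5 * v0_over_sigma ** 2)          # E, defined after Eq. (8.45)
    phi = math.erf(v0_over_sigma / math.sqrt(2.0))   # Phi, Eq. (8.44)
    numerator = 2.0 * ((n - 1.0) * e + 1.0) ** 2
    denominator = math.pi * (phi + n * n * (1.0 - phi))
    return numerator / denominator


if __name__ == "__main__":
    for n, v0 in ((3, 0.996), (4, 0.942)):
        print(f"n = {n}, v0 = {v0} sigma: eta_4 = {eta4_nyquist(n, v0):.3f}")
```

With the thresholds quoted in the text, the function should return values close to 0.881 for $n = 3$ and 0.880 for $n = 4$.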

Having chosen values for $n$ and $v_0$, we can now return to Eqs. (8.42) and (8.43) to examine the relationship of $\rho$ and $\rho_4$. Curve 1 of Fig. 8.7 shows a plot of $\rho$ and $\rho_4$. Extrapolation of a linear relationship with slope chosen to fit low values of $\rho$ results in errors of 1% at $\rho = 0.5$, 2% at $\rho = 0.7$, and 2.8% at $\rho = 0.8$, where the error is a percentage of the true value of $\rho$. Thus, for many purposes, a linear approximation is satisfactory for values of $\rho$ up to $\sim 0.6$. This linearity assumption simplifies the final step that we require in discussing four-level sampling, namely, calculation of the improvement in sensitivity resulting from oversampling.

The relationship between the autocorrelation function for unquantized noise $R_\infty$ and that for the same waveform after four-level quantization is the same as for the corresponding cross-correlation functions in Eq. (8.45), so we can write

$$ R_4 = \frac{2[(n-1)E + 1]^2 R_\infty}{\pi[\Phi + n^2(1-\Phi)]} , \tag{8.53} $$

provided that $R_\infty \lesssim 0.6$. Now $R_\infty$ as given by Eq. (8.15) fulfills this condition for $q = 1$ with an oversampling factor $\beta = 2$. For $n = 3$ and the corresponding optimum value of $v_0$, $E = 0.6091$, $\Phi = 0.6806$, and $R_4 = 0.881R_\infty$. For $\beta = 2$, we use Eqs. (8.15) and (8.53) and Eq. (A8.5) of Appendix 8.1 to evaluate the summation in the denominator of Eq. (8.51), and obtain $\eta_4 = 0.935$, which is a factor of 1.06 greater than for $\beta = 1$. Bowers and Klingler (1974) have pointed out that the optimum value of the quantization level $v_0$ changes slightly with the oversampling factor. However, the optimum values are rather broad (see Fig. 8.6), and the effect on the sensitivity is very small.

Fig. 8.7 Correlation coefficient $\rho$ for unquantized signals plotted as a function of the correlation that would be measured after quantization. The curves are for: (1) full four-level system with $n = 3$ and $v_0 = \sigma$, or $n = 4$ and $v_0 = 0.95\sigma$; (2) four-level system with low-level products omitted, $n = 4$ and $v_0 = 0.9\sigma$; (3) three-level system with $v_0 = 0.6\sigma$. From Cooper (1970). © CSIRO 1970. Published by CSIRO Publishing, Melbourne, Victoria, Australia. Reproduced with permission.

In a discussion of two-bit quantization, Cooper (1970) considered the effect of omitting certain products in the multiplication process. For example, if all products of the two low-level bits are counted as zero instead of $\pm1$, the loss in signal-to-noise ratio is approximately 1%, as shown in curve 4 of Fig. 8.6. The products to be accumulated are then only those counted as $\pm n$ and $\pm n^2$ in the full four-level system described above, and in the modified system, they can be assigned values of $\pm1$ and $\pm n$, respectively, thereby simplifying the counter circuitry of the integrator. An even greater simplification can be accomplished by omitting the intermediate-level products also and assigning values $\pm1$ to the high-level products. This last type of modification yields 92% of the sensitivity of a full four-level correlator. We shall not analyze the case where only the low-level products are omitted, but we note that to derive the correlation coefficient as a function of $\rho$, one can express the action of the correlator in terms of two different quantization characteristics (Hagen and Farley 1973) or else return to Eq. (8.36) and omit the appropriate terms. If both the low- and intermediate-level products are omitted, however, the action can be described more simply in terms of a new quantization characteristic, known as three-level quantization, without arbitrary omission of product terms.


8.3.3 Three-Level Quantization


Three-level quantization has proved to be an important practical technique, and the quantization characteristic is shown in Fig. 8.8. In this case, the approach using Price’s theorem will again be followed.

Fig. 8.8 Characteristic curve for three-level quantization. The abscissa is the unquantized voltage $x$, and the ordinate is the quantized output $\hat{x}$. $v_0$ is the threshold voltage. Since the magnitude of $\hat{x}$ takes only one nonzero value, it is perfectly general to set this value to unity.

The expressions for the operating characteristics of a three-level correlator can be obtained from those in the preceding section by omitting the terms that refer to low- and intermediate-level products and adjusting the weighting factors as appropriate. Thus, the equivalent derivative needed in Price’s theorem is

$$ \frac{\partial\hat{x}}{\partial x} = \delta(x - v_0) + \delta(x + v_0) , \tag{8.54} $$

and the expectation of the correlator output $\langle r_3\rangle$ is, from Price’s theorem,

$$ \langle r_3\rangle = \frac{1}{\pi}\int_0^{\rho}\frac{1}{\sqrt{1-\xi^2}}\left[\exp\left(\frac{-v_0^2}{\sigma^2(1+\xi)}\right) + \exp\left(\frac{-v_0^2}{\sigma^2(1-\xi)}\right)\right] d\xi , \tag{8.55} $$

where $\xi$ is a dummy variable of integration. The normalized correlation coefficient is

$$ \rho_3 = \frac{\langle r_3\rangle}{1-\Phi} , \tag{8.56} $$

where $\Phi$ is given by Eq. (8.44). For $\rho \ll 1$, Eqs. (8.55) and (8.56) yield

$$ (\rho_3)_{\rho\ll1} = \rho\,\frac{2E^2}{\pi(1-\Phi)} , \tag{8.57} $$

where $E$ is defined following Eq. (8.45). The variance of $r_3$ is

$$ \sigma_3^2 = \langle r_3^2\rangle - \langle r_3\rangle^2 = N^{-1}(1-\Phi)^2\left[1 + 2\sum_{q=1}^{\infty} R_3^2(q\tau_s) - \rho_3^2\right] , \tag{8.58} $$

where $R_3$ is the autocorrelation coefficient after three-level quantization. If $\rho_3^2$ in Eq. (8.58) can be neglected, the signal-to-noise ratio relative to a nonquantizing correlator is

$$ \eta_3 = \frac{R_{\rm sn3}}{R_{{\rm sn}\infty}} = \frac{\langle r_3\rangle}{\sigma_3 R_{{\rm sn}\infty}} = \frac{2\sqrt{\beta}\,E^2}{\pi(1-\Phi)\sqrt{1 + 2\sum_{q=1}^{\infty} R_3^2(q\tau_s)}} . \tag{8.59} $$

For Nyquist sampling, the maximum sensitivity relative to the nonquantizing case is obtained with $v_0 = 0.6120\sigma$, for which $\eta_3$ is equal to 0.810 (see curve 5 of Fig. 8.6). With this optimized threshold value, $\Phi = 0.4595$, $E = 0.8292$, and we can write $R_3(q\tau_s) = 0.810R_\infty(q\tau_s)$, assuming that $\rho$ is an approximately linear function of $r_3$. Then from Eqs. (8.15), (8.59), and Eq. (A8.5), we find that for a rectangular baseband spectrum with the oversampling factor $\beta = 2$, $\eta_3$ becomes 0.890, which is a factor of 1.10 greater than for $\beta = 1$.
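The oversampling figures for both the three-level and four-level cases follow from one short calculation. Under the linear approximation $R_Q = \eta_1 R_\infty$, where $\eta_1$ is the Nyquist-rate efficiency, and with $\sum R_\infty^2(q\tau_s) = (\beta-1)/2$ from Appendix 8.1, Eqs. (8.51) and (8.59) both reduce to $\eta_1\sqrt{\beta}/\sqrt{1 + \eta_1^2(\beta - 1)}$. The sketch below implements this; the function names are illustrative, and the closed form inherits the linearity assumption, so it is expected to become less accurate when low-lag values of $R_\infty$ exceed about 0.6.

```python
import math


def eta3_nyquist(v0_over_sigma):
    """Eq. (8.59) with beta = 1: three-level efficiency 2 E^2 / [pi (1 - Phi)]."""
    e = math.exp(-0.5 * v0_over_sigma ** 2)
    phi = math.erf(v0_over_sigma / math.sqrt(2.0))
    return 2.0 * e * e / (math.pi * (1.0 - phi))


def eta_oversampled(eta_nyquist, beta):
    """Efficiency at oversampling factor beta, assuming R_Q = eta_nyquist * R_inf
    and a rectangular baseband spectrum, so that sum R_inf^2 = (beta - 1) / 2."""
    return eta_nyquist * math.sqrt(beta) / math.sqrt(
        1.0 + eta_nyquist ** 2 * (beta - 1.0)
    )


if __name__ == "__main__":
    eta3 = eta3_nyquist(0.612)
    print(f"three-level, beta = 1: {eta3:.3f}")
    print(f"three-level, beta = 2: {eta_oversampled(0.810, 2.0):.3f}")
    print(f"four-level,  beta = 2: {eta_oversampled(0.881, 2.0):.3f}")
```

The three printed values should be close to 0.810, 0.890, and 0.935, matching the numbers quoted for three-level and four-level sampling.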


8.3.4 Quantization Efficiency: Simplified Analysis for Four or More Levels


For quantization into two, three, or four levels, the quantization efficiency, $\eta_Q$, is 0.637, 0.810, and 0.881, respectively. For more quantization levels, the loss in efficiency resulting from the quantization decreases further, and an approximate method of calculating the loss (Thompson 1998) can be used, as follows. This is simpler than the more accurate method given in Sect. 8.3.5. In either case, the principle is to calculate the fractional increase in the variance of a signal that results from the quantization. The signal-to-noise ratio at the correlator output is inversely proportional to this variance.

Fig. 8.9 Piecewise linear representation of the Gaussian probability distribution of the amplitude of a signal within the receiver. The intersections of the curve with the vertical lines denote exact values of the Gaussian. The abscissa is the signal amplitude (voltage) in units of $\epsilon\sigma$, and the numbers indicate the values assigned to the levels after quantization. For eight-level sampling the quantization thresholds are indicated by the seven vertical lines that lie between $-3.5\epsilon\sigma$ and $3.5\epsilon\sigma$ on the abscissa. For signal levels outside the range $\pm4\epsilon\sigma$, indicated by the shaded areas, the assigned values are $\pm3.5\epsilon\sigma$.

Figure 8.9 shows a piecewise linear approximation of the Gaussian probability distribution of a signal from one antenna. This approximation simplifies the analysis. The intersections with the vertical lines indicate exact values of the Gaussian. For eight-level sampling, the quantization thresholds are indicated by the positions of the vertical lines between the numbers $\pm3.5$ on the abscissa. The horizontal spacing between adjacent levels is represented by $\epsilon$, in units of the (unquantized) rms voltage, $\sigma$, i.e., $\epsilon\sigma$ is the spacing between the levels in volts. We consider first the case in which the number of levels is even, as in Fig. 8.9. Any one sample that falls between the two consecutive thresholds at $m\epsilon\sigma$ and $(m+1)\epsilon\sigma$ will be assigned a value $(m+\tfrac12)\epsilon\sigma$. The normalized trapezoidal probability distribution for the voltage in this segment of the overall probability distribution in Fig. 8.9 can be written as

$$ p(v) = \frac{1}{\epsilon\sigma} + \left[v - \left(m + \tfrac12\right)\epsilon\sigma\right]\Delta_m , \qquad m\epsilon\sigma < v < (m+1)\epsilon\sigma , \tag{8.60} $$

where $\Delta_m$ is the change in probability over the voltage range $m\epsilon\sigma$ to $(m+1)\epsilon\sigma$. The extra variance that is incurred by quantizing the voltage is

$$ \left\langle\left[v - \left(m + \tfrac12\right)\epsilon\sigma\right]^2\right\rangle = \int_{m\epsilon\sigma}^{(m+1)\epsilon\sigma}\left[v - \left(m + \tfrac12\right)\epsilon\sigma\right]^2 p(v)\,dv . \tag{8.61} $$

If we make the substitution $x = v - (m+\tfrac12)\epsilon\sigma$, the excess variance becomes

$$ \int_{-\epsilon\sigma/2}^{\epsilon\sigma/2} x^2\left[\frac{1}{\epsilon\sigma} + x\Delta_m\right] dx , \tag{8.62} $$

or

$$ \frac{2}{\epsilon\sigma}\int_0^{\epsilon\sigma/2} x^2\,dx = \frac13\left(\frac{\epsilon\sigma}{2}\right)^2 . \tag{8.63} $$

Note that the $\Delta_m$ factor does not appear in Eq. (8.63), because the term in $x^3$ is odd and integrates to zero. Hence, the excess variance is the same for all voltage bins from $-4\epsilon\sigma$ to $4\epsilon\sigma$. The fraction of the area under the Gaussian probability curve that lies between these levels is

$$ \frac{1}{\sigma\sqrt{2\pi}}\int_{-4\epsilon\sigma}^{4\epsilon\sigma} e^{-v^2/2\sigma^2}\,dv = {\rm erf}\left(\frac{4\epsilon}{\sqrt{2}}\right) . \tag{8.64} $$

Thus, the variance resulting from quantization of the signal samples with amplitudes in the range $\pm4\epsilon\sigma$ is

$$ \frac13\left(\frac{\epsilon\sigma}{2}\right)^2{\rm erf}\left(\frac{4\epsilon}{\sqrt{2}}\right) . \tag{8.65} $$


We shall assume that the quantization error is essentially uncorrelated with the 
unquantized signal. In the extreme case of two-level sampling, the quantization error 
is highly correlated with the unquantized signal, so the treatment used here would 
not apply. Consider, however, the case of multilevel quantization, as in Fig. 8.10. 
If the signal voltage is increased steadily, the quantization error decreases from a 
maximum at each quantization threshold to zero when the voltage is equal to the 
midpoint of two thresholds. At each threshold, the quantization error changes sign, 
and the cycle repeats. This behavior greatly reduces any correlation between the 
quantization error and the signal waveform. 

It is also necessary to take account of the effect of counting all signals below $-4\epsilon\sigma$ as level $-3.5\epsilon\sigma$, and those above $+4\epsilon\sigma$ as $+3.5\epsilon\sigma$. To make an approximate estimate of this effect, we divide the range of signal level outside of $\pm4\epsilon\sigma$ into intervals of width $\epsilon\sigma$. Consider, for example, the interval centered on $6.5\epsilon\sigma$. The probability of the signal falling within this level is equal to the corresponding area under the curve, which for the piecewise linear approximation is

$$ \frac{\epsilon}{2\sqrt{2\pi}}\left[e^{-6^2\epsilon^2/2} + e^{-7^2\epsilon^2/2}\right] . \tag{8.66} $$

The variance resulting from quantization of the signal within this range is closely approximated by $[(6.5 - 3.5)\epsilon\sigma]^2$, so the total variance of the quantization error for signals outside the range $\pm4\epsilon\sigma$ is

$$ \frac{\epsilon^3\sigma^2}{\sqrt{2\pi}}\sum_{m=4}^{\infty}(m-3)^2\left[e^{-m^2\epsilon^2/2} + e^{-(m+1)^2\epsilon^2/2}\right] . \tag{8.67} $$

Fig. 8.10 Examples of quantization characteristics for (left diagram) an even number of levels (eight), and (right diagram) an odd number of levels (nine). Units on both axes are equal to $\epsilon$. The abscissa is the analog (unquantized) voltage, and the ordinate is the quantized output. The dotted curves show the analog level minus the quantized level. Note that for even numbers of levels, the thresholds occur at integral values on the abscissa, whereas for odd numbers of levels, they occur at values that are an integer plus one-half.

In practice, the summation in (8.67) converges rapidly, and only a few terms are needed (i.e., those for $m\epsilon \lesssim 3$). The quantization error resulting from the truncation of the signal values outside the range $\pm4\epsilon\sigma$ clearly has some degree of correlation with the unquantized signal level. However, this is a small effect because the fraction of samples for which the signal lies outside $\pm4\epsilon\sigma$ is less than 1.6% for eight-level quantization, with $\epsilon$ optimized for sensitivity. The percentage decreases as the number of quantization levels increases. We shall therefore treat the quantization error resulting from the truncation of the signal peaks as uncorrelated with the signal, but bear in mind that this assumption may introduce a small uncertainty into the calculation.

The variance of the quantized signal is equal to the variance of the unquantized signal ($\sigma^2$) plus the variance of the quantization errors in (8.65) and (8.67), that is,

$$ \sigma^2 + \frac13\left(\frac{\epsilon\sigma}{2}\right)^2{\rm erf}\left(\frac{4\epsilon}{\sqrt{2}}\right) + \frac{\epsilon^3\sigma^2}{\sqrt{2\pi}}\sum_{m=4}^{\infty}(m-3)^2\left[e^{-m^2\epsilon^2/2} + e^{-(m+1)^2\epsilon^2/2}\right] . \tag{8.68} $$

If the variance is the same for both signals at the correlator input, and if the correlation of the signals is small (i.e., $\rho \ll 1$), then the signal-to-noise ratio at the correlator output is inversely proportional to the variance. Thus, the quantization efficiency is

$$ \eta_{(2N)} = \left\{1 + \frac13\left(\frac{\epsilon}{2}\right)^2{\rm erf}\left(\frac{N\epsilon}{\sqrt{2}}\right) + \frac{\epsilon^3}{\sqrt{2\pi}}\sum_{m=N}^{\infty}(m - N + 1)^2\left[e^{-m^2\epsilon^2/2} + e^{-(m+1)^2\epsilon^2/2}\right]\right\}^{-1} . \tag{8.69} $$

Here, the equation has been generalized for $2N$ levels. For an odd number of levels, $2N + 1$, one of which is centered on zero signal level, the equivalent equation for the quantization efficiency is

$$ \eta_{(2N+1)} = \left\{1 + \frac13\left(\frac{\epsilon}{2}\right)^2{\rm erf}\left(\frac{(N + \tfrac12)\epsilon}{\sqrt{2}}\right) + \frac{\epsilon^3}{\sqrt{2\pi}}\sum_{m=N+1}^{\infty}(m - N)^2\left[e^{-(m-\frac12)^2\epsilon^2/2} + e^{-(m+\frac12)^2\epsilon^2/2}\right]\right\}^{-1} . \tag{8.70} $$

Results from Eqs. (8.69) and (8.70) are given in Table 8.1. The values of $\epsilon$ are those that maximize $\eta_Q$. The fourth column of the table gives $P$, which is the fraction of samples for which the signal amplitude is greater than $\pm N\epsilon\sigma$ for an even number of levels or greater than $\pm(N + \tfrac12)\epsilon\sigma$ for an odd number of levels. For eight levels, $P$ is the fraction of signal samples that contribute to the variance in (8.67). The values of $\eta_Q$ calculated here are accurate to about 2% for $Q = 4$ and to 0.1% for $Q = 8$ and higher.

Table 8.1 Quantization efficiency and other factors for four or more levels

Number of levels (Q)     N      ε       P         η_Q
  4                      2      1.08    0.03      0.86
  8                      4      0.60    0.016     0.960
  9                      4      0.55    0.013     0.968
 16                      8      0.34    0.006     0.988
 32                     16      0.19    0.002     0.996
256                    128      0.03    <0.001    1.000
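The entries of Table 8.1 can be regenerated from Eqs. (8.69) and (8.70). The sketch below is one way to do this; the function names, the argument `n_half` (standing for $N$), and the truncation of the infinite sums after `n_terms` terms are assumptions of this example, and the values of $\epsilon$ are simply taken from the table rather than optimized.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def eta_even(n_half, eps, n_terms=200):
    """Eq. (8.69): approximate efficiency for Q = 2N levels (n_half = N)."""
    in_range = (eps / 2.0) ** 2 / 3.0 * math.erf(n_half * eps / math.sqrt(2.0))
    tail = sum(
        (m - n_half + 1) ** 2
        * (math.exp(-0.5 * (m * eps) ** 2) + math.exp(-0.5 * ((m + 1) * eps) ** 2))
        for m in range(n_half, n_half + n_terms)
    )
    return 1.0 / (1.0 + in_range + eps ** 3 / SQRT_2PI * tail)


def eta_odd(n_half, eps, n_terms=200):
    """Eq. (8.70): approximate efficiency for Q = 2N + 1 levels (n_half = N)."""
    in_range = (eps / 2.0) ** 2 / 3.0 * math.erf((n_half + 0.5) * eps / math.sqrt(2.0))
    tail = sum(
        (m - n_half) ** 2
        * (math.exp(-0.5 * ((m - 0.5) * eps) ** 2)
           + math.exp(-0.5 * ((m + 0.5) * eps) ** 2))
        for m in range(n_half + 1, n_half + 1 + n_terms)
    )
    return 1.0 / (1.0 + in_range + eps ** 3 / SQRT_2PI * tail)


if __name__ == "__main__":
    print("Q = 4 :", round(eta_even(2, 1.08), 3))
    print("Q = 8 :", round(eta_even(4, 0.60), 3))
    print("Q = 9 :", round(eta_odd(4, 0.55), 3))
    print("Q = 16:", round(eta_even(8, 0.34), 3))
```

Because the terms of the sums fall off as Gaussians in $m\epsilon$, a few hundred terms are far more than needed for the values of $\epsilon$ in the table.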


8.3.5 Quantization Efficiency: Full Analysis, Three or More Levels


This section presents a general analysis of quantized systems for three or more levels [e.g., Thompson et al. (2007)]. Let $x$ represent the voltage of the unquantized signal samples, which have a Gaussian probability distribution with variance $\sigma^2$. Let $\hat{x}$ represent the quantized values of $x$. The difference $x - \hat{x}$ represents an inequality introduced by the quantization. The inequality contains a component that is correlated with $x$, and an uncorrelated component that behaves much like random noise. Consider the correlation coefficient between $x$ and $x' = x - \alpha\hat{x}$, where $\alpha$ is a scaling factor. The correlation coefficient is

$$ \frac{\langle xx'\rangle}{x_{\rm rms}\,x'_{\rm rms}} = \frac{\langle x^2\rangle - \alpha\langle x\hat{x}\rangle}{x_{\rm rms}\,x'_{\rm rms}} . \tag{8.71} $$

Here, the angle brackets $\langle\ \rangle$ indicate the mean value. If $\alpha = \langle x^2\rangle/\langle x\hat{x}\rangle$, then the correlation coefficient is zero, and $x'$ represents purely random noise. We refer to this random component as the quantization noise, equal to $x - \alpha_1\hat{x}$, where $\alpha_1 = \langle x^2\rangle/\langle x\hat{x}\rangle$. Without loss of generality, we take $\sigma^2 = \langle x^2\rangle = 1$ in this analysis and use $\alpha_1 = 1/\langle x\hat{x}\rangle$. The variance of the quantization noise is

$$ \langle q^2\rangle = \langle(x - \alpha_1\hat{x})^2\rangle = \langle x^2\rangle - 2\alpha_1\langle x\hat{x}\rangle + \alpha_1^2\langle\hat{x}^2\rangle = \alpha_1^2\langle\hat{x}^2\rangle - 1 . \tag{8.72} $$

The total variance of the digitized signal is $1 + \langle q^2\rangle$, and the quantization efficiency $\eta_Q$ is equal to the variance of the unquantized signal expressed as a fraction of the total variance. Thus,

$$ \eta_Q = \frac{1}{1 + \langle q^2\rangle} = \frac{1}{\alpha_1^2\langle\hat{x}^2\rangle} = \frac{\langle x\hat{x}\rangle^2}{\langle\hat{x}^2\rangle} . \tag{8.73} $$


Consider the case for an even number of equally spaced levels, as in the eight-level
case in Fig. 8.10. When the number of levels is even, it is convenient to define $N$
as half the number of levels. We first determine $\langle x\hat{x}\rangle$. Note that for each
sample value, $x$ and $\hat{x}$ have the same sign, so $x\hat{x}$ is always positive. Let $\epsilon$
represent the spacing between adjacent quantization levels. The values of $x$ that fall
within the quantization level between $m\epsilon$ and $(m+1)\epsilon$ are assigned values
$\hat{x} = (m + \tfrac{1}{2})\epsilon$, and their contribution to $\langle x\hat{x}\rangle$ is

$$
\frac{1}{\sqrt{2\pi}} \int_{m\epsilon}^{(m+1)\epsilon}
  \left(m + \tfrac{1}{2}\right) \epsilon\, x\, e^{-x^2/2}\, dx .
\tag{8.74}
$$

The contribution from the level between $-m\epsilon$ and $-(m+1)\epsilon$ is the same as the
expression above, so to obtain $\langle x\hat{x}\rangle$, we sum the integrals for the positive
levels and include a factor of two:

$$
\langle x\hat{x} \rangle = \sqrt{\frac{2}{\pi}} \left[
  \sum_{m=0}^{N-2} \int_{m\epsilon}^{(m+1)\epsilon}
     \left(m + \tfrac{1}{2}\right) \epsilon\, x\, e^{-x^2/2}\, dx
  + \int_{(N-1)\epsilon}^{\infty}
     \left(N - \tfrac{1}{2}\right) \epsilon\, x\, e^{-x^2/2}\, dx \right] .
\tag{8.75}
$$

The summation term contains one integral for each positive quantization level
except the highest one. The second integral covers the highest level and
the range of $x$ above it, for both of which the assigned value is
$\hat{x} = (N - \tfrac{1}{2})\epsilon$. Then, performing the integration, Eq. (8.75) reduces to

$$
\langle x\hat{x} \rangle = \sqrt{\frac{2}{\pi}}\, \epsilon
  \left( \frac{1}{2} + \sum_{m=1}^{N-1} e^{-m^2\epsilon^2/2} \right) .
\tag{8.76}
$$

To evaluate the variance of $\hat{x}$, again consider first the contribution from values of
$x$ that fall between $m\epsilon$ and $(m+1)\epsilon$. For this level, the quantized data $\hat{x}$ all
have the value $(m + \tfrac{1}{2})\epsilon$. The contribution to the variance of $\hat{x}$ from all
values of $x$ within this level is

$$
\left(m + \tfrac{1}{2}\right)^2 \epsilon^2\, \frac{1}{\sqrt{2\pi}}
  \int_{m\epsilon}^{(m+1)\epsilon} e^{-x^2/2}\, dx .
\tag{8.77}
$$

For negative $x$, we again include a factor of 2, sum over all positive quantization
levels except the highest, and add a term for the highest level and the range of $x$
above it. Thus, the total variance of $\hat{x}$ is

$$
\langle \hat{x}^2 \rangle = \sqrt{\frac{2}{\pi}} \left[
  \sum_{m=0}^{N-2} \left(m + \tfrac{1}{2}\right)^2 \epsilon^2
     \int_{m\epsilon}^{(m+1)\epsilon} e^{-x^2/2}\, dx
  + \left(N - \tfrac{1}{2}\right)^2 \epsilon^2
     \int_{(N-1)\epsilon}^{\infty} e^{-x^2/2}\, dx \right] .
\tag{8.78}
$$

The integrals in Eq. (8.78) can be represented by error functions. Then, using
Eqs. (8.73), (8.76), and (8.78), we obtain

$$
\eta_{Q(2N)} = \frac{\dfrac{2}{\pi}
  \left( \dfrac{1}{2} + \displaystyle\sum_{m=1}^{N-1} e^{-m^2\epsilon^2/2} \right)^2}
  {\left(N - \dfrac{1}{2}\right)^2
   - 2 \displaystyle\sum_{m=1}^{N-1} m\, \mathrm{erf}\!\left( \dfrac{m\epsilon}{\sqrt{2}} \right)} .
\tag{8.79}
$$


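For the cases in which the number of levels is odd, the thresholds of the levels
occur at values that are an integer plus $\tfrac{1}{2}$, as in the nine-level case in
Fig. 8.10. We represent the odd level number by $2N + 1$. Consider the values of $x$
that fall within the quantization level between $(m - \tfrac{1}{2})\epsilon$ and
$(m + \tfrac{1}{2})\epsilon$. These are assigned the value $m\epsilon$, i.e., zero for the level
centered on $x = 0$. For this level, the contribution to $\langle x\hat{x}\rangle$ is

$$
\frac{1}{\sqrt{2\pi}} \int_{(m-\frac{1}{2})\epsilon}^{(m+\frac{1}{2})\epsilon}
  m\epsilon\, x\, e^{-x^2/2}\, dx .
\tag{8.80}
$$

Summing over all levels, as in Eq. (8.75), we obtain

$$
\langle x\hat{x} \rangle = \sqrt{\frac{2}{\pi}} \left[
  \sum_{m=1}^{N-1} \int_{(m-\frac{1}{2})\epsilon}^{(m+\frac{1}{2})\epsilon}
     m\epsilon\, x\, e^{-x^2/2}\, dx
  + \int_{(N-\frac{1}{2})\epsilon}^{\infty} N\epsilon\, x\, e^{-x^2/2}\, dx \right] .
\tag{8.81}
$$

Then, as in Eq. (8.78), we determine $\langle \hat{x}^2 \rangle$:

$$
\langle \hat{x}^2 \rangle = \sqrt{\frac{2}{\pi}} \left[
  \sum_{m=1}^{N-1} \int_{(m-\frac{1}{2})\epsilon}^{(m+\frac{1}{2})\epsilon}
     (m\epsilon)^2\, e^{-x^2/2}\, dx
  + \int_{(N-\frac{1}{2})\epsilon}^{\infty} (N\epsilon)^2\, e^{-x^2/2}\, dx \right] .
\tag{8.82}
$$

Performing the integration in Eqs. (8.81) and (8.82), from (8.73), we obtain

$$
\eta_{Q(2N+1)} = \frac{\dfrac{2}{\pi}
  \left[ \displaystyle\sum_{m=1}^{N} e^{-(m-\frac{1}{2})^2\epsilon^2/2} \right]^2}
  {N^2 - \displaystyle\sum_{m=1}^{N} (2m - 1)\,
     \mathrm{erf}\!\left[ \dfrac{(m - \frac{1}{2})\epsilon}{\sqrt{2}} \right]} .
\tag{8.83}
$$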

Equations (8.79) and (8.83) can easily be evaluated numerically and provide values 
of quantization efficiency for any number of equally spaced levels. 
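The numerical evaluation mentioned above can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names (`eta_q_even`, `eta_q_odd`, `best_spacing`) and the coarse grid search are assumptions made for this example, and $\sigma = 1$ is taken as in the text.

```python
import math


def eta_q_even(n_half, eps):
    """Eq. (8.79): efficiency for 2N equally spaced levels (n_half = N)."""
    num = 0.5 + sum(math.exp(-(m * eps) ** 2 / 2.0) for m in range(1, n_half))
    den = (n_half - 0.5) ** 2 - 2.0 * sum(
        m * math.erf(m * eps / math.sqrt(2.0)) for m in range(1, n_half)
    )
    return (2.0 / math.pi) * num ** 2 / den


def eta_q_odd(n, eps):
    """Eq. (8.83): efficiency for 2N + 1 equally spaced levels (n = N)."""
    num = sum(math.exp(-((m - 0.5) * eps) ** 2 / 2.0) for m in range(1, n + 1))
    den = n ** 2 - sum(
        (2 * m - 1) * math.erf((m - 0.5) * eps / math.sqrt(2.0))
        for m in range(1, n + 1)
    )
    return (2.0 / math.pi) * num ** 2 / den


def best_spacing(eta, n, lo=0.01, hi=3.0, steps=3000):
    """Hypothetical helper: coarse grid search for the spacing that maximizes eta."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda e: eta(n, e))


if __name__ == "__main__":
    print("4 levels:", eta_q_even(2, 0.995))
    print("3 levels:", eta_q_odd(1, 1.224))
    print("8 levels:", eta_q_even(4, 0.586))
    print("best spacing, 4 levels:", best_spacing(eta_q_even, 2))
```

With `n_half = 1` the sums in `eta_q_even` are empty and the function returns $2/\pi$ for any spacing, which is the two-level result.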

Since no significant approximations were made, the same method can be used
for cases in which the number of quantization levels is small and consequently the
quantization noise is relatively large. Values of $\eta_Q$ for two, three, and four levels
can be obtained by considering the effect of the quantization noise at a correlator
input, following the method used above. In cases such as that in Appendix 8.3, for
which the assigned values for the levels are chosen to optimize $\eta_Q$, or for which
the spacing between the level thresholds is not uniform, the formulas derived here
cannot be applied directly. However, the same general approach of considering the
spacings between levels can be used. For three-level quantization, the levels for
maximum quantization efficiency are $\pm 0.612\sigma$ ($\epsilon = 1.224$). Then we have

$$
\langle x\hat{x} \rangle = \sqrt{\frac{2}{\pi}}\, \epsilon \int_{\epsilon/2}^{\infty} x\, e^{-x^2/2}\, dx ,
\tag{8.84}
$$

$$
\langle \hat{x}^2 \rangle = \sqrt{\frac{2}{\pi}}\, \epsilon^2 \int_{\epsilon/2}^{\infty} e^{-x^2/2}\, dx ,
\tag{8.85}
$$

and

$$
\eta_{Q(3)} = \frac{\langle x\hat{x} \rangle^2}{\langle \hat{x}^2 \rangle}
  = \frac{\dfrac{2}{\pi}\, e^{-0.612^2}}
         {1 - \mathrm{erf}\!\left( \dfrac{0.612}{\sqrt{2}} \right)} .
\tag{8.86}
$$


Examples of results derived using Eqs. (8.79), (8.83), and (8.86) are shown in
Table 8.2. In each case, the value of $\epsilon$ is chosen to maximize $\eta_Q$. The values of
$\eta_Q$ are given to five decimal places to show how they approach 1.0 as the number
of levels increases. However, this is for the case of an ideal rectangular passband,
which in a practical receiving system may be closely approximated. Figure 8.11
shows the quantization efficiency $\eta_Q$ as a function of the threshold spacing $\epsilon$.

Table 8.2 Examples of quantization efficiency, $\eta_Q$, for sampling at the Nyquist rate

Number of levels (Q)    N      $\epsilon$      $\eta_Q$
3                       1      1.224     0.80983
4                       2      0.995     0.88115
8                       4      0.586     0.96256
9                       4      0.534     0.96930
16                      8      0.335     0.98846
32                      16     0.188     0.99651
256                     128    0.0312    0.99991
If the constant voltage spacing between adjacent thresholds for both input and
output values is not maintained, the individual levels can sometimes be adjusted to
obtain an improvement in $\eta_Q$ of a few tenths of a percent, decreasing with increasing
number of levels. The values of $\eta_Q$ in Table 8.2 are in agreement with results by
Jenet and Anderson (1998), who give detailed calculations of performance for two-
to eight-bit quantization, for both uniform and nonuniform threshold spacing. See
also Appendix 8.3 for optimization in the case of four-level quantization.

In recent designs of radio telescopes, the level increment $\epsilon$ is frequently chosen so
that signals at levels much higher than the rms system noise can be accommodated
within the range of levels of the quantizer. This preserves an essentially linear
response to interfering signals so that they can be eliminated or mitigated by further
processing. For example, with 256 levels (8-bit representation) and $\epsilon = 0.5$, we
find that $\eta_Q = 0.9796$. The range of $\pm 128$ levels then corresponds to $\pm 64\sigma$, i.e.,
+36 dB above the system noise, for a ~2% sacrifice in signal-to-noise ratio.
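The arithmetic in this example can be checked with a short sketch. It relies on one assumption that is not taken from the text: for many finely spaced levels, the quantization noise variance is approximated by the standard uniform-quantizer value $\epsilon^2/12$, so that $\eta_Q \approx 1/(1 + \epsilon^2/12)$. The function names are hypothetical.

```python
import math


def eta_fine(eps):
    """Approximate efficiency for many levels (assumed noise variance eps**2 / 12)."""
    return 1.0 / (1.0 + eps ** 2 / 12.0)


def headroom_db(n_levels, eps):
    """Largest representable amplitude, relative to sigma, in dB.

    Half of the levels are positive, each of width eps (in units of sigma).
    """
    return 20.0 * math.log10((n_levels / 2.0) * eps)


if __name__ == "__main__":
    print("eta_Q for eps = 0.5:", eta_fine(0.5))
    print("headroom for 256 levels:", headroom_db(256, 0.5), "dB")
```

Under this approximation, $\epsilon = 0.5$ gives an efficiency close to the value 0.9796 quoted above, and 128 positive levels of width $0.5\sigma$ span $64\sigma$, or about 36 dB.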


8.3.6 Correlation Estimates for Strong Sources 


The efficiency calculations of the previous sections are based on estimates of
the correlation from the averaged signal products before or after quantization,
$\langle x_i y_i\rangle$ or $\langle \hat{x}_i \hat{y}_i\rangle$, in the limit of small correlation, $|\rho| \ll 1$.
Johnson et al. (2013) show that when the correlation is small ($|\rho| \ll 1$) and the signal
variances are known (as is assumed when setting sampler thresholds), averaged products
$\langle \hat{x}_i \hat{y}_i\rangle$ do provide optimal estimates of the correlation. That is, when the
correlation is small, no combination of the quantized signals will produce an unbiased
estimate of correlation that has smaller variance than that of the correlator output, if
suitable weights are chosen. This result arises from the form of the bivariate Gaussian
distribution in this limit, which can be written such that the factor including the
correlation coefficient $\rho$ includes only terms of the form $xy$ [see, e.g., Eq. (8.2)].

Fig. 8.11 Quantization efficiency as a function of the threshold spacing, $\epsilon$, in units equal to the rms
amplitude, $\sigma$. The curves are for 64-level (solid line), 16-level (long-dashed line), 9-level (short-
dashed line), and 4-level (long-and-short-dashed line). As $\epsilon$ becomes very small, the output of the
quantizer depends mainly on the sign of the input, so the curves meet the ordinate axis at the two-
level value of $\eta_Q = 2/\pi$. As $\epsilon$ increases, more of the higher (positive and negative) levels contain
only values in the extended tails of the Gaussian distribution, so the number of levels that make
a significant contribution to the output decreases, and the curves merge together. The curves for
even-level numbers move asymptotically to the two-level value, and curves for odd-level numbers
move toward zero. The working point in each case is chosen to be near the maximum of the curve.


However, when the correlation is large, alternative estimates of the correlation
will have lower noise. Thus, in the high correlation regime, it is necessary to revise
our expression for quantization efficiency. For instance, when the signals are not
quantized, the optimal estimate of correlation for two zero-mean signals is Pearson’s
correlation coefficient [e.g., Wall and Jenkins (2012)],

$$
r_p = \frac{\displaystyle\sum_{i=1}^{N_N} (x_i - \bar{x})(y_i - \bar{y})}
           {\sqrt{\displaystyle\sum_{i=1}^{N_N} (x_i - \bar{x})^2\,
                  \displaystyle\sum_{i=1}^{N_N} (y_i - \bar{y})^2}} ,
\tag{8.87}
$$

where $\bar{x} = \frac{1}{N_N}\sum_{i=1}^{N_N} x_i$ is the sample mean, and the sums in the
denominator are proportional to the sample variances. The standard error in the estimate
of $r_p$, i.e., $\sigma_p$, is

$$
\sigma_p = N_N^{-1/2} \left(1 - \rho^2\right) .
\tag{8.88}
$$


As $\rho$ approaches unity, $\sigma_p$ goes to zero. In this limit, the probability function
$p(x, y)$ given in Eq. (8.1) collapses to a one-dimensional Gaussian distribution along
the line $x = y$ (see Fig. 8.2). When $\rho = 1$, that line is perfectly defined by a
set of measurements of $x_i$ and $y_i$, i.e., there are no deviations from the line, and
the uncertainty in the estimate of $\rho$ is zero. Perhaps surprisingly, the estimate of
correlation made without the sample means and variances, as in Eq. (8.8), i.e.,

$$
r_\infty = \frac{1}{N_N} \sum_{i=1}^{N_N} x_i y_i ,
\tag{8.89}
$$

has an error, when normalized by $\sigma^2$, of

$$
\sigma_\infty = N_N^{-1/2} \left(1 + \rho^2\right)^{1/2}
\tag{8.90}
$$

[see Eq. (8.13)], which equals $\sigma_p$ only for $\rho = 0$. For two-level quantization, for
a rectangular passband and $\beta = 1$, the error on the correlation estimate is [see
Eq. (8.30a) and Eq. (8.35)]

$$
\sigma_2 = N_N^{-1/2} \left(1 - \rho_2^2\right)^{1/2} .
\tag{8.91}
$$

In the case of large correlation, the Van Vleck relation [Eq. (8.25)]

$$
\rho = \sin\left( \frac{\pi}{2} \rho_2 \right)
\tag{8.92}
$$

will require a nonlinear scaling of the error in $\rho_2$, denoted $\sigma_{2V}$, which can be written

$$
\sigma_{2V} = N_N^{-1/2}
  \left[ \left(\frac{\pi}{2}\right)^2 - \left(\sin^{-1}\rho\right)^2 \right]^{1/2}
  \left(1 - \rho^2\right)^{1/2} .
\tag{8.93}
$$


These errors in correlation for the various cases described above [$\sigma_p$, $\sigma_\infty$, and
$\sigma_{2V}$, as well as the error for the case of four-level sampling, derived from formulas
by Gwinn (2004)] are shown in Fig. 8.12. The interesting result is that the performance
of the two-level correlator is better than that of the unquantized correlator for
$\rho > 0.6$ and approaches that of the Pearson estimator as $\rho$ approaches unity. The
peculiarity in the two-level scenario was noted by Cole (1968) and is related to the
fact that the sample variance is irrelevant in the two-level quantization estimate.
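The comparison above can be illustrated numerically. The sketch below evaluates the three noise factors (the square root of the number of samples times the standard error) from Eqs. (8.88), (8.90), and (8.93), and locates the correlation at which the two-level estimate becomes better than the unquantized signal-product estimate. The function names and the bisection helper are assumptions made for this illustration.

```python
import math


def pearson_factor(rho):
    """Noise factor for Pearson's estimator, Eq. (8.88)."""
    return 1.0 - rho ** 2


def product_factor(rho):
    """Noise factor for the unquantized signal-product estimate, Eq. (8.90)."""
    return math.sqrt(1.0 + rho ** 2)


def two_level_factor(rho):
    """Noise factor for two-level quantization after Van Vleck scaling, Eq. (8.93)."""
    return math.sqrt(((math.pi / 2.0) ** 2 - math.asin(rho) ** 2) * (1.0 - rho ** 2))


def crossover(lo=0.1, hi=0.9, tol=1e-10):
    """Hypothetical helper: bisection for the rho at which Eqs. (8.90) and (8.93) agree."""
    def diff(r):
        return two_level_factor(r) - product_factor(r)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6, 0.9):
        print(rho, pearson_factor(rho), product_factor(rho), two_level_factor(rho))
    print("crossover near rho =", crossover())
```

At $\rho = 0$ the two-level factor equals $\pi/2$, the reciprocal of the two-level quantization efficiency, and the crossover found by bisection lies close to the value of 0.6 quoted in the text.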

Johnson et al. (2013) derive maximum likelihood estimators (MLEs) for the 
unsampled case in which the signal variance is known. Its standard deviation, o,, 
falls slightly below o,. The authors also derive MLEs for various quantization levels 
and show that their performance o,(Q) approaches o; for large values of Q. 


Fig. 8.12 The correlation noise factor, $N_N^{1/2}$ times the standard deviation, vs. correlation $\rho$. The
correlation factors for $\rho = 0$ are equal to $1/\eta_Q$. The curves labeled unquantized and two-level
quantization give correlation factors based on the standard signal-product estimate of $\rho$ [see
Eqs. (8.90) and (8.93) for unquantized and two-level quantization, respectively]. The factor for
the Pearson’s r curve, given in Eq. (8.88), is based on the estimator $r_p$, which involves the sample
mean and variance. Adapted from Johnson et al. (2013).


8.4 Further Effects of Quantization


Various forms of analysis in radio astronomy involve cross-correlation of signals
from different antennas or autocorrelation of a signal as a function of time. The
values of the correlation of quantized signals deviate from the true correlation of
the unquantized signals to an extent that is most serious for two-level sampling,
and the deviation decreases as the number of levels is increased. Correction for
this effect requires determination of how the cross-correlation of the quantized
data, here designated $R$, is related to the true cross-correlation, $\rho$. To examine the
effect of quantization, we consider the effect of a time offset $\tau$ on two Gaussian
waveforms that are otherwise identical. In the case of two-level sampling, the
required relationship is given by the Van Vleck equation [Eq. (8.25)] and is

$$
R(\tau) = \frac{2}{\pi} \sin^{-1} \rho(\tau) .
\tag{8.94}
$$

For more than two quantization levels, the relationship is more complicated, and
although the nonlinearity of the quantized correlation becomes less serious with
an increasing number of levels, correction may still be necessary. As very large
instruments come into operation, it becomes increasingly important to remove the
responses to strong radio sources in order to study the fainter emission from the
most distant regions of the Universe. This requires very accurate calibration of the
received signal strengths.
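For the two-level case, the correction amounts to inverting the Van Vleck relation. A minimal sketch follows; the function names are hypothetical, and the inputs are assumed to be correlation coefficients in the range $-1$ to $1$.

```python
import math


def quantized_from_true(rho):
    """Two-level correlation R for a true correlation rho, Eq. (8.94)."""
    return (2.0 / math.pi) * math.asin(rho)


def true_from_quantized(r):
    """Inverse relation: recover rho from the measured two-level correlation R."""
    return math.sin(0.5 * math.pi * r)


if __name__ == "__main__":
    for rho in (0.01, 0.1, 0.5, 0.9):
        r = quantized_from_true(rho)
        print(rho, r, r / rho, true_from_quantized(r))
```

For small $\rho$ the ratio $R/\rho$ approaches $2/\pi$, while for larger $\rho$ the ratio rises, which is the nonlinearity that has to be corrected.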


8.4.1 Correlation Coefficient for Quantized Data 


Let $x$ and $y$ represent two Gaussianly distributed data streams that differ only by a
time offset, $\tau$. The correlation coefficient, $\rho(\tau)$, is equal to
$\langle xy\rangle / \langle x^2\rangle$. The quantized values of $x$ and $y$ are identified by circumflex
accents, i.e., $\hat{x}$ and $\hat{y}$. The correlation coefficient of the quantized variables is

$$
R(\tau) = \langle \hat{x}\hat{y} \rangle / \langle \hat{x}^2 \rangle .
\tag{8.95}
$$

To determine $R(\tau)$ as a function of the correlation coefficient $\rho$, we need to consider
the probabilities of occurrence of the unquantized variables $x$ and $y$ within each
quantization interval. First, consider the case in which the number of quantization
intervals is even and equal to $2N$. Thus, there are $N$ positive intervals plus $N$
negative ones. The mean value of the products of pairs of the quantized values,
$\langle \hat{x}\hat{y}\rangle$, is obtained by considering each of the $2N \times 2N = 4N^2$ possible pairings
of the levels of $\hat{x}$ and $\hat{y}$. Only half of these need be calculated, since if the $x$ and
$y$ values are interchanged, the probability remains the same. The probability of the
unquantized variables $x$ and $y$ falling within any pair of intervals is given by integration
of the Gaussian bivariate probability distribution, Eq. (8.1), over the corresponding range
of $x$ and $y$. In Eq. (8.1), $x$ and $y$ have variance $\sigma^2$ and cross-correlation coefficient $\rho$.
Here, we are concerned with samples of $x$ and $y$ taken at the Nyquist interval $\tau_s$, and
$n$ is the number of Nyquist intervals between the pairs of samples considered. For a
rectangular passband of width $\Delta\nu$, the correlation coefficient is given by

$$
\rho(n\tau_s) = \frac{\sin(2\pi \Delta\nu\, n\tau_s)}{2\pi \Delta\nu\, n\tau_s} .
\tag{8.96}
$$


To calculate $\langle \hat{x}\hat{y}\rangle$ for each combination of two quantization intervals, the joint
probability of the required unquantized variables falling within these intervals is
multiplied by the product of the corresponding values assigned in the quantization
process. These results are then summed for all the pairs of intervals. Since the
probability distributions of $x$ and $y$ are both symmetrical about zero, first consider
the case in which both of these variables are positive and run from zero to $N$. As
noted above, we take the step size to be unity. Let $L(i)$ be the series of $N + 1$ values
that define the positive quantization steps, i.e., $0, 1, 2, \ldots, (N - 1), \infty$. Thus,
for $i = 1$ to $N$, $L(i) = i - 1$, and $L(N + 1) = \infty$. For $y$, there is an identical series of
levels represented as $L(j)$. Then the component of $\langle \hat{x}\hat{y}\rangle$ that results from the
positive ranges of $x$ and $y$ is

$$
\sum_{i=1}^{N} \sum_{j=1}^{N} \left(i - \tfrac{1}{2}\right) \left(j - \tfrac{1}{2}\right)
  \int_{L(i)}^{L(i+1)} \int_{L(j)}^{L(j+1)} p(x, y)\, dy\, dx ,
\tag{8.97}
$$


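where $(i - 1/2)$ and $(j - 1/2)$ are the values of the digital data assigned to
the corresponding quantization intervals, and $p(x, y)$ is the Gaussian bivariate
probability distribution, Eq. (8.1). The case in which both $x$ and $y$ are negative
provides an equal component of $\langle \hat{x}\hat{y}\rangle$. Thus, the component of $\langle \hat{x}\hat{y}\rangle$ for
cases in which $x$ and $y$ have the same sign is

$$
\frac{1}{\pi\sigma^2\sqrt{1 - \rho^2}}
  \sum_{i=1}^{N} \sum_{j=1}^{N} \left(i - \tfrac{1}{2}\right) \left(j - \tfrac{1}{2}\right)
  \int_{L(i)}^{L(i+1)} \int_{L(j)}^{L(j+1)}
  \exp\left[ \frac{-(x^2 + y^2 - 2\rho xy)}{2\sigma^2(1 - \rho^2)} \right] dy\, dx .
\tag{8.98}
$$

For the cases in which $x$ and $y$ have opposite signs, one of either $(i - 1/2)$ or $(j - 1/2)$
is negative, and the sign of either $x$ or $y$ within the exponential function in Eq. (8.98)
is negative. When the corresponding expression is included (with negative sign since
the component of $\langle \hat{x}\hat{y}\rangle$ is negative), we obtain

$$
\langle \hat{x}\hat{y} \rangle = \frac{1}{\pi\sigma^2\sqrt{1 - \rho^2}}
  \sum_{i=1}^{N} \sum_{j=1}^{N} \left(i - \tfrac{1}{2}\right) \left(j - \tfrac{1}{2}\right)
  \int_{L(i)}^{L(i+1)} \int_{L(j)}^{L(j+1)}
  \left\{ \exp\left[ \frac{-(x^2 + y^2 - 2\rho xy)}{2\sigma^2(1 - \rho^2)} \right]
        - \exp\left[ \frac{-(x^2 + y^2 + 2\rho xy)}{2\sigma^2(1 - \rho^2)} \right] \right\} dy\, dx .
\tag{8.99}
$$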

Equation (8.99) shows how $\langle \hat{x}\hat{y}\rangle$ is derived using the usual form of the bivariate
distribution in Eq. (8.1). An equivalent form of the probability distribution of $x$ and
$y$ in Eq. (8.1) is given by Abramowitz and Stegun (1968, see Eqs. 26.2.1 and 26.3.2),
which avoids the explicit use of the double integrals. Equation (8.99) can then be
written as follows:

where erfc is the complementary error function (1 − erf).

To calculate $\langle \hat{x}^2\rangle$, the Gaussian probability function for a single variable is used,
taking double the expression for the positive range of $x$:

$$
\langle \hat{x}^2 \rangle = \frac{2}{\sigma\sqrt{2\pi}}
  \sum_{i=1}^{N} \left(i - \tfrac{1}{2}\right)^2
  \int_{L(i)}^{L(i+1)} \exp\left( \frac{-x^2}{2\sigma^2} \right) dx
  = \sum_{i=1}^{N} \left(i - \tfrac{1}{2}\right)^2
  \left[ \mathrm{erfc}\left( \frac{L(i)}{\sigma\sqrt{2}} \right)
       - \mathrm{erfc}\left( \frac{L(i+1)}{\sigma\sqrt{2}} \right) \right] .
\tag{8.101}
$$

Fig. 8.13 Curves of the correlation coefficient of quantized data as a function of the true
correlation (i.e., the correlation of the unquantized data). The lowest (solid) curve is for
2-level quantization, and moving upward, the curves are for 3 levels (long dashes), 4 levels
(long and short dashes), 8 levels (small dashes) and 16 levels (solid line). Similar curves
for three and four quantization levels are given in Fig. 8.7.


Fig. 8.14 Curves of the correlation coefficient of quantized data, expressed as a fraction of the
true correlation. These are from the same data as used in Fig. 8.13, but here they are plotted as
fractions of the true (unquantized) correlation, in which the nonlinearity appears as a deviation
from a horizontal line. The lowest curve is for 2-level quantization, and moving upward, the curves
are for 3, 4, 8, and 16 levels. The points at which the curves meet the left vertical axis indicate
the reduction in correlation resulting from quantization when the signal-to-noise ratio is low, as
given in Table 8.2. The signal-to-noise ratio increases as the curves move from left to right, and
the correlation coefficients of both the quantized and unquantized data move toward 1.0 for the
theoretical case of complete correlation between the two signals.

Thus, for a given value of the time interval between samples, the correlation
coefficient for the quantized data is as given in Eqs. (8.95), (8.99) or (8.100),
and (8.101). Note that the ratio $\langle \hat{x}\hat{y}\rangle / \langle \hat{x}^2\rangle$ is independent of the frequency
response of the system considered and is based on a Gaussian distribution of the amplitude.
Figures 8.13 and 8.14 show examples of the relationship between the correlation
of the quantized signals and the true signal correlation. Both of the figures result
from the same analysis, but the presentation in Fig. 8.14, in which the correlation of
the quantized data is shown as a fraction of the true correlation, helps to emphasize
the nonlinearity in the response. A linear response would appear as a horizontal line
in Fig. 8.14, and the curves approach this condition as the number of quantization
levels increases. Except for observations of the strongest sources, the signal-to-
noise ratio from an individual element of a synthesis array is small. Thus, the
working point on the curves in Figs. 8.13 and 8.14 is generally near the left side,
where the linearity for signals from cross-correlated pairs is best. As the number
of quantization levels increases, the accuracy of the correlation increases. The
curves provide an indication of the extent to which the quantization affects the
measurement of cross-correlation of signals with Gaussian amplitude distribution. A
detailed discussion of the effects of quantization of the signal amplitude is given by
Benkevitch et al. (2016). This includes the case in which the cross-correlated signals
have different amplitudes, and the effects of quantization as the cross-correlation of
the analog waveforms approaches unity.
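Curves of this kind can be approximated with a short numerical sketch. The approach below is an alternative to the double sums of Eq. (8.99), chosen here for brevity and not taken from the text: for a given $x$, the variable $y$ is Gaussian with mean $\rho x$ and variance $1 - \rho^2$ (with $\sigma = 1$), so the mean of $\hat{y}$ given $x$ has a closed form and only a one-dimensional integral over $x$ remains. Equally spaced levels, the midpoint-rule integration, and all function names are assumptions made for this illustration.

```python
import math


def gauss_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gauss_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def uniform_quantizer(n_levels, eps):
    """Thresholds and assigned values for n_levels equally spaced levels."""
    half = (n_levels - 1) / 2.0
    values = [(k - half) * eps for k in range(n_levels)]
    inner = [(k - half + 0.5) * eps for k in range(n_levels - 1)]
    return [-math.inf] + inner + [math.inf], values


def quantized_correlation(rho, n_levels, eps, xmax=8.0, n_sub=2000):
    """R = <x_hat y_hat> / <x_hat^2> for unit-variance inputs with correlation rho."""
    thresholds, values = uniform_quantizer(n_levels, eps)
    s = math.sqrt(1.0 - rho * rho)  # requires |rho| < 1

    def mean_yhat_given_x(x):
        # y given x is Gaussian with mean rho * x and standard deviation s.
        return sum(
            v * (gauss_cdf((hi - rho * x) / s) - gauss_cdf((lo - rho * x) / s))
            for v, lo, hi in zip(values, thresholds, thresholds[1:])
        )

    xy = 0.0
    xx = 0.0
    for v, lo, hi in zip(values, thresholds, thresholds[1:]):
        a, b = max(lo, -xmax), min(hi, xmax)
        if b <= a:
            continue
        h = (b - a) / n_sub
        for k in range(n_sub):  # midpoint rule within one quantization interval
            x = a + (k + 0.5) * h
            w = gauss_pdf(x) * h
            xy += v * mean_yhat_given_x(x) * w
            xx += v * v * w
    return xy / xx


if __name__ == "__main__":
    for rho in (0.01, 0.25, 0.5, 0.75, 0.95):
        print(rho, quantized_correlation(rho, 4, 0.995) / rho)
```

For two levels the result should follow the Van Vleck relation, and for small $\rho$ the ratio $R/\rho$ should approach the quantization efficiency of Table 8.2, which corresponds to the left-hand intercepts in Fig. 8.14.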

For ease of computation, the correlation can be expressed as a rational function,
or similar approximation, of the correlator output; see Appendix 8.3 for four-level
quantization. For three-level quantization, procedures for determination of the cross-
correlation, $\rho$, from the correlator output are given by Kulkarni and Heiles (1980)
and D’Addario et al. (1984). However, with the continuing increase in computer
power, larger numbers of levels are generally used.


8.4.2 Oversampling 


Sampling of signals at the Nyquist rate results in no loss of information, but
quantization causes a reduction in sensitivity as represented by the quantization
efficiency. Some of the loss due to quantization can be recovered by oversampling,
that is, sampling faster than the Nyquist rate. For sampling of random noise with
an ideal rectangular spectrum of width $\Delta\nu$, the time interval between adjacent
Nyquist samples is $1/(2\Delta\nu)$. With Nyquist sampling, the noise within each sample
is uncorrelated with respect to the noise in any other sample, and when such
data are combined, the noise combines additively in power. Consider the case of
oversampling in which the number of samples per second is $\beta$ times the Nyquist
rate. When the sample rate exceeds the Nyquist rate, the samples are no longer
independent, and for any particular sample, there are components of the noise within
other samples that are correlated with the noise in the sample considered. [Note,
however, that for any two samples spaced by $\beta$ times the sample interval (i.e.,
spaced at the Nyquist interval), or by an integral multiple of the Nyquist interval, the
noise is uncorrelated.*] The correlated components of the noise in different samples
combine additively in voltage, rather than additively in power, as is the case for
uncorrelated noise.

To illustrate how the components of noise combine, consider one pair of antennas
and, for example, just four consecutive samples at the correlator output. Let $a_1$, $a_2$,
$a_3$, and $a_4$ be these voltages, which are proportional to the product of the voltages
at the correlator inputs. Then we have for the squared sum of these correlated noise
voltages, i.e., the total noise power,

$$
(a_1 + a_2 + a_3 + a_4)^2 = a_1^2 + a_2^2 + a_3^2 + a_4^2
  + 2(a_1 a_2 + a_1 a_3 + a_1 a_4 + a_2 a_3 + a_2 a_4 + a_3 a_4) .
\tag{8.102}
$$


The autocorrelation coefficient of the quantized signals at the correlator input is
$R(n\tau_s)$, where $n$ is an integer and $\tau_s$ is the spacing in time between adjacent samples.
The output of the correlator consists of values that are the product of two input
samples, so the autocorrelation coefficient of the samples at the correlator output is
$R^2(n\tau_s)$. The mean noise power is given by the mean of the terms in the right side
of Eq. (8.102), in which each of the $a_n^2$ terms can be replaced by the mean squared
noise amplitude $\langle a^2\rangle$, and each of the $a_m a_n$ terms by $\langle a^2\rangle R^2(|n - m|\tau_s)$. Thus, the
squared sum of the four noise voltages becomes

$$
4\langle a^2 \rangle + 2\langle a^2 \rangle \left[ 3R^2(\tau_s) + 2R^2(2\tau_s) + R^2(3\tau_s) \right] .
\tag{8.103}
$$

If the four noise terms were uncorrelated, i.e., if the $R^2$ terms were zero, the noise
power would be the sum of the individual noise powers, $4\langle a^2\rangle$. The effect of the
correlation of the noise is to increase the averaged noise power by a factor equal
to (8.103) divided by $4\langle a^2\rangle$:

$$
1 + 2\left[ (3/4)R^2(\tau_s) + (1/2)R^2(2\tau_s) + (1/4)R^2(3\tau_s) \right] .
\tag{8.104}
$$

* It can be assumed that the noise components of the signals from any two antennas are
uncorrelated, because noise from the sky background that is received in separate antennas is
resolved, and generally the antennas are sufficiently far apart that cross talk of instrumental noise
can be ignored.


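In the general case, averaging a total of $N$ samples at the correlator output, this factor
becomes

$$
1 + 2\left[ \left(\frac{N-1}{N}\right) R^2(\tau_s) + \left(\frac{N-2}{N}\right) R^2(2\tau_s)
  + \left(\frac{N-3}{N}\right) R^2(3\tau_s) + \cdots
  + \left(\frac{1}{N}\right) R^2[(N-1)\tau_s] \right] .
\tag{8.105}
$$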

In practice, in radio astronomy, the rate at which the data are sampled is in the range
of MHz to GHz. The averaging times are in the range milliseconds to seconds, so $N$
is likely to be within the range $10^3$ to $10^9$. The autocorrelation coefficient decreases
as the time interval between samples increases, and in practice, $R^2(n\tau_s)$ becomes
very small for $n\tau_s \gtrsim 200$ times the Nyquist sample interval. Thus, for the terms
within the square brackets in Eq. (8.105), those after about the first $\sim 200\beta$ can be
neglected. Since, in most cases, $N \gg 200\beta$, the factor in Eq. (8.105)
simplifies to

$$
1 + 2\left[ R^2(\tau_s) + R^2(2\tau_s) + R^2(3\tau_s) + \cdots \right]
  = 1 + 2\sum_{n=1}^{\infty} R^2(n\tau_s) .
\tag{8.106}
$$


Equation (8.106) is the fractional increase in the squared noise voltage (i.e., the
noise power) that results from the fact that the noise in the samples is no longer
independent when the data are oversampled. The quantization efficiency $\eta_Q$ is equal
to the quantization efficiency for Nyquist sampling, $\eta_{QN}$, multiplied by $\sqrt{\beta}$ to take
account of the increase in the number of samples, but divided by the square root of
Eq. (8.106) because the noise in different samples is no longer independent. Thus,
noting that $\tau_s = 1/(2\beta\Delta\nu)$, we obtain

$$
\eta_Q = \frac{\eta_{QN}\sqrt{\beta}}
  {\sqrt{1 + 2\displaystyle\sum_{n=1}^{\infty} R^2\!\left( \frac{n}{2\beta\Delta\nu} \right)}} .
\tag{8.107}
$$

To illustrate the effect of oversampling, examples of the quantization efficiency $\eta_Q$,
derived using Eqs. (8.95), (8.100), (8.101), and (8.107), are shown in Table 8.3.
These are for 2-, 3-, 4-, 8-, and 16-level sampling and values of $\beta$ equal to 1, 2, 4,
8, 16, and 32. In each case, the value of $\epsilon$ used is the one that maximizes $\eta_Q$ for
Nyquist sampling, as given in Thompson et al. (2007).† Note that as $\beta$ is increased,
the improvement gained by each further increase declines, because the correlation
between adjacent samples increases, and thus, the new information provided by finer
sampling becomes progressively smaller.

Table 8.3 Variation of quantization efficiency, $\eta_Q$, with oversampling factor $\beta$

No. of levels   $\epsilon$     $\beta=1$    $\beta=2$   $\beta=4$   $\beta=8$   $\beta=16$   $\beta=32$
2               -       0.6366   0.744   0.784   0.795   0.798    0.799
3               1.224   0.8098   0.882   0.912   0.920   0.922    0.923
4               0.995   0.8812   0.930   0.951   0.958   0.960    0.960
8               0.586   0.9626   0.980   0.987   0.991   0.991    0.992
16              0.335   0.9885   0.994   0.996   0.998   0.998    0.998
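The two-level row of Table 8.3 can be approximated with a short calculation, since for two levels the quantized autocorrelation follows directly from the Van Vleck relation, Eq. (8.94). The sketch below assumes an ideal rectangular baseband spectrum, for which the true autocorrelation at lag $n\tau_s$ is $\sin(\pi n/\beta)/(\pi n/\beta)$, and truncates the infinite sum of Eq. (8.107) at a finite number of terms. The function names and the truncation length are choices made for this illustration.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def eta_two_level(beta, n_terms=100_000):
    """Eq. (8.107) for two-level quantization and oversampling factor beta."""
    total = 0.0
    for n in range(1, n_terms + 1):
        rho = sinc(n / beta)                   # true autocorrelation at lag n * tau_s
        r = (2.0 / math.pi) * math.asin(rho)   # quantized autocorrelation, Eq. (8.94)
        total += r * r
    eta_nyquist = 2.0 / math.pi                # two-level efficiency at the Nyquist rate
    return eta_nyquist * math.sqrt(beta) / math.sqrt(1.0 + 2.0 * total)


if __name__ == "__main__":
    for beta in (1, 2, 4, 8):
        print(beta, eta_two_level(beta))
```

For $\beta = 1$ every lag falls on a zero of the sinc function, so the sum vanishes and the Nyquist-rate value is recovered. The terms fall off only as $1/n^2$, so the truncation length limits the accuracy for large $\beta$.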


† In this reference, $\sigma$ is taken to be unity and $\epsilon$, the size of the quantization steps, is chosen to
maximize the quantization efficiency.


8.4.3 Quantization Levels and Data Processing


At this point, it is useful to put into perspective the characteristics of quantization
schemes, which are summarized in Tables 8.2 and 8.3. It should be remembered
that the assumption $\rho \ll 1$ was used in determining these values. In considering
the relative advantages of different quantization schemes, we note first that both
the quantization efficiency $\eta_Q$ and the receiving bandwidth $\Delta\nu$ may be limited by
the size and speed of the correlator system. The overall sensitivity is proportional to
$\eta_Q\sqrt{\Delta\nu}$. Consider two conditions. In the first, the observing bandwidth is limited by
factors other than the capacity of the digital system. This can occur in spectral line
observing or when the interference-free band is of limited width. The sensitivity
limitation imposed by the correlator system then involves only the quantization
efficiency $\eta_Q$ in Table 8.2, and the choice of quantization scheme is one between
simplicity and sensitivity. In the second case, the observing bandwidth is set by the
maximum bit rate that the digital system can handle, as may occur in continuum
observation in the higher-frequency bands. For a fixed bit rate $\nu_b$, the sample
rate is $\nu_b/N_b$, where $N_b$ is the number of bits per sample, and the maximum
signal bandwidth $\Delta\nu$ is $\nu_b/(2\beta N_b)$, where $\beta$ is the oversampling factor. Thus, the
sensitivity is proportional to $\eta_Q/\sqrt{\beta N_b}$, and this factor is listed for various systems
in Table 8.4, in which $N_b = 1$ for $Q = 2$ and $N_b = 2$ for $Q = 3$ or 4. Note that
oversampling always reduces the performance under these conditions. For those
situations in which the capacity of the correlator is limited by the maximum bit
rate, the value of 0.64 for Nyquist sampling with two-level quantization results in
the highest overall performance. Four-level sampling is almost as good, and four
or more levels would be preferred if the bandwidth is limited, as in spectral line
observations.

Table 8.4 Sensitivity factor $\eta_Q/\sqrt{\beta N_b}$ for a correlator-limited system

Number of quantization levels (Q)    $\beta=1$    $\beta=2$
2 (1 bit)                            0.64     0.53
3 (2 bits)                           0.57     0.44
4 (2 bits)                           0.62     0.47
8 (3 bits)                           0.56     0.40
16 (4 bits)                          0.49     0.35
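The entries of Table 8.4 follow from those of Table 8.3 by dividing by $\sqrt{\beta N_b}$. A minimal sketch of that bookkeeping is given below; the dictionary layout and function names are assumptions made for the example, and the efficiencies are copied from Table 8.3.

```python
import math

# (levels Q, bits per sample N_b) -> {oversampling factor beta: eta_Q from Table 8.3}
ETA_Q = {
    (2, 1): {1: 0.6366, 2: 0.744},
    (3, 2): {1: 0.8098, 2: 0.882},
    (4, 2): {1: 0.8812, 2: 0.930},
    (8, 3): {1: 0.9626, 2: 0.980},
    (16, 4): {1: 0.9885, 2: 0.994},
}


def sensitivity_factor(eta_q, beta, n_bits):
    """Sensitivity factor eta_Q / sqrt(beta * N_b) for a fixed total bit rate."""
    return eta_q / math.sqrt(beta * n_bits)


def table_8_4():
    """Sensitivity factors for every scheme and oversampling factor in ETA_Q."""
    return {
        (q, n_bits): {b: sensitivity_factor(e, b, n_bits) for b, e in row.items()}
        for (q, n_bits), row in ETA_Q.items()
    }


if __name__ == "__main__":
    for (q, n_bits), row in table_8_4().items():
        print(q, n_bits, {b: round(f, 2) for b, f in row.items()})
```

Every $\beta = 2$ entry is smaller than the corresponding $\beta = 1$ entry, which is the statement above that oversampling reduces performance when the bit rate is the limiting resource.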

A three-level × five-level correlator, for which the quantization efficiency $\eta_Q$ is
0.86, was constructed by Bowers et al. (1973) for spectral line imaging with a two-
element interferometer.

A further point to be noted is that with an analog correlator, the sin × sin and
cos × cos products for signals from two antennas provide, in principle, exactly
the same information. However, with a digital correlator, the quantization noise is
largely uncorrelated between the sine and cosine components of the signal, so the
quantization loss can be reduced by generating both products and averaging them.


8.5 Accuracy in Digital Sampling 


Deviations from ideal performance in practical samplers result in errors that, if not 
corrected for, can limit the accuracy of images synthesized from the data. Once the 
signal is in digital form, however, the rate at which errors are introduced is usually 
negligibly small. 

Two-level samplers, which sense only the sign of the signal voltages, are the
simplest to construct. The most serious error that is likely to occur is in the
definition of the zero level, in which a small voltage offset may occur. The effect
of offsets in the samplers is to produce small offsets of positive or negative polarity
in the correlator outputs, which can be largely eliminated by phase switching, as
described in Sect. 7.5. Alternatively, the offsets in the samplers can be measured by
incorporating counters to compare the numbers of positive and negative samples
produced. Correction for the offsets can then be applied to the correlator output data
[see, e.g., Davis (1974)].
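One way the counter readings could be turned into an offset estimate is sketched below. This is an illustration under stated assumptions, not the procedure of Davis (1974): the input is taken to be zero-mean Gaussian noise, so that the fraction of negative samples equals the Gaussian cumulative probability evaluated at the threshold offset. The function name is hypothetical.

```python
from statistics import NormalDist


def threshold_offset(n_positive, n_negative):
    """Estimate the zero-level offset of a two-level sampler, in units of sigma.

    Assumes zero-mean Gaussian input: a threshold at +v classifies a fraction
    Phi(v / sigma) of the samples as negative. Both counts must be nonzero.
    """
    frac_negative = n_negative / (n_positive + n_negative)
    return NormalDist().inv_cdf(frac_negative)


if __name__ == "__main__":
    print(threshold_offset(5000, 5000))   # balanced counts, no offset
    print(threshold_offset(4960, 5040))   # slight excess of negative samples
```

A positive result indicates a threshold above zero. The statistical uncertainty of the estimate decreases as the square root of the total number of samples counted.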

In samplers with three or more quantization levels, the performance depends on 
the specification of the levels with respect to the rms signal level, $\sigma$. An automatic
level control (ALC) circuit is therefore sometimes used at the sampler input. Errors 
resulting from incorrect signal amplitude become less important as the number of 
quantization levels is increased; with many levels, the signal amplitude becomes 
simply a linear factor in the correlator output. In systems using complex correlators, 
two samplers are usually required for each signal, one at each output of a quadrature 
network. The accuracy of the quadrature network and the relative timing of the two 
sample pulses are also important considerations. 


8.5.1 Tolerances in Digital Sampling Levels 


This section provides an example of the accuracy required in sampling. It is based on
a study of errors in three-level sampling thresholds by D’Addario et al. (1984). We
start by considering the diagram in Fig. 8.15, which shows the sampling thresholds
for a pair of signals to be correlated. Thresholds $v_1$ and $-v_2$ apply to the signal
waveform $x(t)$ and $v_3$ and $-v_4$ to $y(t)$. The Gaussian probability distribution of $x$ and
$y$ is given by Eq. (8.1), and the correlator output is proportional to this probability
integrated over the $(x, y)$ plane with the weighting factors $\pm 1$ and zero indicated in
the figure. This approach enables one to investigate the effect of deviations of the
sampler thresholds from the optimum, $v_0 = 0.612\sigma$.

Fig. 8.15 Threshold diagram for a correlator, the inputs of which are three-level
quantized signals. x and y [...] show the combinations of input levels for which the
output is nonzero.

For three-level sampling, the correlator output can be written

$$
\langle r_3(\alpha_i, \rho) \rangle = \left[ L(\alpha_1, \alpha_3, \rho) + L(\alpha_2, \alpha_4, \rho)
  - L(\alpha_1, \alpha_4, -\rho) - L(\alpha_2, \alpha_3, -\rho) \right] ,
\tag{8.108}
$$


where α_i = v_i/σ, and 

$$
L(\alpha_i, \alpha_k, \rho) = \int_{\alpha_i}^{\infty} \int_{\alpha_k}^{\infty} \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left[ \frac{-(X^2 + Y^2 - 2\rho XY)}{2(1-\rho^2)} \right] dX\, dY . \tag{8.109}
$$

Here, X = x/σ, Y = y/σ, and the integrand in Eq. (8.109) is equivalent to the 
expression in Eq. (8.1) but with the variables measured in units of σ. 

D’Addario et al. (1984) point out that since less than 5% loss in signal-to-noise 
ratio occurs for threshold departures of ±40% from optimum, the required accuracy 
of the threshold settings, in practice, depends mainly on the algorithm used to 
correct the result. Suppose that the thresholds are kept close to, but not exactly equal 
to, the optimum value. For the x sampler in Fig. 8.15, the deviations from the ideal 
threshold value α_0 can be expressed in terms of an even part 

$$
\Delta_{gx} = \tfrac{1}{2}(\alpha_1 + \alpha_2) - \alpha_0 , \tag{8.110}
$$

and an odd part 

$$
\Delta_{ox} = \tfrac{1}{2}(\alpha_1 - \alpha_2) . \tag{8.111}
$$


For the y sampler, Δ_gy and Δ_oy are similarly defined. The Δ_g terms produce gain 
errors. They are equivalent to an error in the level of the signal at the sampler, and 
they have the effect of introducing a multiplicative error in the measured 
cross-correlation. The Δ_o terms produce offset errors in the correlator output and are 
potentially more damaging since such errors can be large compared with the low 
levels of cross-correlation resulting from weak sources. The offset errors, however, 
can be removed with high precision by phase switching. The cancellation of the 
offset results from the sign reversal of the digital samples, or of the correlator output, 
as described in Sect. 7.5. The correlator output of a phase-switched system is of the 
form 

$$
r_{3s}(\alpha, \rho) = \tfrac{1}{2} \left[ r_3(\alpha, \rho) - r_3(\alpha, -\rho) \right] . \tag{8.112}
$$
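As a numerical illustration, Eqs. (8.108), (8.109), and (8.112) can be evaluated directly. The following Python sketch is not taken from D’Addario et al. (1984). It reduces the double integral of Eq. (8.109) to a one-dimensional Simpson integration over X, using the fact that Y, for a given X, is Gaussian with mean ρX and variance 1 − ρ². The function names, the integration limits, and the example thresholds are assumptions chosen for illustration only.

```python
import math


def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def L(a_i, a_k, rho, steps=4000, x_max=10.0):
    """Eq. (8.109): P(X > a_i, Y > a_k) for unit-variance Gaussian X and Y
    with correlation rho (|rho| < 1), by Simpson integration over X."""
    s = math.sqrt(1.0 - rho * rho)
    h = (x_max - a_i) / steps          # steps must be even for Simpson's rule
    total = 0.0
    for n in range(steps + 1):
        x = a_i + n * h
        pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        f = pdf * upper_tail((a_k - rho * x) / s)
        weight = 1 if n in (0, steps) else (4 if n % 2 else 2)
        total += weight * f
    return total * h / 3.0


def r3(alpha, rho):
    """Eq. (8.108) for thresholds alpha = (a1, a2, a3, a4) in units of sigma."""
    a1, a2, a3, a4 = alpha
    return (L(a1, a3, rho) + L(a2, a4, rho)
            - L(a1, a4, -rho) - L(a2, a3, -rho))


def r3s(alpha, rho):
    """Eq. (8.112): output of the phase-switched correlator."""
    return 0.5 * (r3(alpha, rho) - r3(alpha, -rho))


if __name__ == "__main__":
    # Hypothetical thresholds with odd (offset) errors in both samplers.
    alpha = (0.65, 0.58, 0.64, 0.59)
    print("offset at rho = 0, no phase switching:", r3(alpha, 0.0))
    print("offset at rho = 0, phase switched:    ", r3s(alpha, 0.0))
    print("phase-switched output at rho = 0.01:  ", r3s(alpha, 0.01))
```

With odd threshold errors in both samplers, r3 evaluated at ρ = 0 is nonzero, which is the offset error, whereas r3s vanishes at ρ = 0 by construction.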


If all α values are within ±10% of α_0, the output is always within 10⁻³ (relative 
error) of the output of a correlator with the same gain errors, but no offset errors, in 
the samplers. Thus, with phase switching, errors of up to ∼10% in the thresholds 
may be tolerable. Also, corrections can be made for gain errors if the actual 
threshold levels are known. Since the probability density distribution of the signal 
amplitudes can be assumed to be Gaussian, the threshold levels can be determined 
by counting the relative numbers of +1, 0, and −1 outputs from each sampler. When 
ρ is small (a few percent), a simple correction for the gain error can be obtained by 
dividing the correlator output by the arithmetic mean of the numbers of high-level 
(±1) samples for the two signals. Then 10% errors in the threshold settings result 
in errors of less than 1% in ρ. 
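The determination of threshold levels from sample counts can be sketched as follows. This is a minimal illustration that assumes zero-mean Gaussian input voltages; the function names are hypothetical, and the inverse normal distribution comes from the Python standard library. The even and odd parts follow Eqs. (8.110) and (8.111).

```python
from statistics import NormalDist


def thresholds_from_counts(n_plus, n_zero, n_minus):
    """Estimate the upper and lower thresholds (in units of sigma) of a
    three-level sampler from the numbers of +1, 0, and -1 outputs."""
    total = n_plus + n_zero + n_minus
    std = NormalDist()                       # zero mean, unit variance
    a_upper = std.inv_cdf(1.0 - n_plus / total)   # P(X > a1) = n_plus / total
    a_lower = -std.inv_cdf(n_minus / total)       # P(X < -a2) = n_minus / total
    return a_upper, a_lower


def even_odd_errors(a_upper, a_lower, alpha0=0.612):
    """Even (gain) and odd (offset) parts of the threshold errors,
    as in Eqs. (8.110) and (8.111)."""
    even = 0.5 * (a_upper + a_lower) - alpha0
    odd = 0.5 * (a_upper - a_lower)
    return even, odd


if __name__ == "__main__":
    # Hypothetical counts from one sampler.
    a1, a2 = thresholds_from_counts(257_000, 462_000, 281_000)
    print("thresholds:", a1, a2)
    print("even, odd errors:", even_odd_errors(a1, a2))
```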

Another nonideal aspect of the behavior of the sampler and quantizer is that 
the threshold level may not be precisely defined but may be influenced by effects 
such as the direction and rate of change of the signal voltage, the previous sample 
value (hysteresis), and noise in the sampling circuitry. The result can be modeled 
by including an indecision region in the sampler response extending from α_i − Δ to 
α_i + Δ. It is assumed that a signal that falls within this region results in an output that 
takes either of the two values associated with the threshold randomly and with equal 
probability. The three-level threshold diagram with indecision regions included is 
shown in Fig. 8.16. 


Fig. 8.16 Threshold diagram for a three-level correlator showing indecision regions and the 
shaded areas within them for which the response is nonzero. The figures ±1, ±1/2, and ±1/4 indicate 
the correlator response. The diagram shows the (X, Y) plane in which the signals are normalized 
to the rms value σ. 

The weighting in the indecision regions depends on the probability of the random 
sample values and is 1/4 when both signals fall within indecision regions, and 1/2 
when one signal is within an indecision region and the other produces a nonzero 
output. As before, the correlator output can be obtained by integrating the weighted 
probability of the signal values over the (X, Y) plane. Figure 8.17 shows the decrease 
in the correlator output as a function of Δ for several values of ρ, computed 
by expressing the output decrease as a Maclaurin series in Δ (D’Addario et al. 
1984). For all cases except those in which ρ approaches unity, the relatively small 
decrease in output results from the fact that when one input waveform falls within 
an indecision region, the other generally does not. For the particular case of ρ = 1, 
the input waveforms are identical and fall within these regions simultaneously. The 
output decrease is then proportional to Δ, as shown by the broken line in Fig. 8.17. 
However, this case is only of limited practical importance. For a 1% maximum error, 
Δ must not exceed 0.11σ, so the indecision region can be as large as ±18% of the 
threshold value. For a maximum error of 0.1%, the above limits must be divided 
by √10. Thus, the indecision regions have large enough tolerances that their effect 
may be negligible. 
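The arithmetic behind these limits can be written out as follows. The √10 scaling suggests that the output decrease varies approximately as Δ² when ρ is not close to unity; that reading is an inference from the figures quoted above, not a statement taken from D’Addario et al. (1984).

```latex
\frac{\Delta_{\max}}{\alpha_0} = \frac{0.11}{0.612} \approx 0.18
\quad (1\%\ \text{error}), \qquad
\Delta_{\max} \approx \frac{0.11\,\sigma}{\sqrt{10}} \approx 0.035\,\sigma , \quad
\frac{\Delta_{\max}}{\alpha_0} \approx \frac{0.18}{\sqrt{10}} \approx 0.057
\quad (0.1\%\ \text{error}).
```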


8.6 Digital Delay Circuits 


Time delays that are multiples of the sample interval can be applied to streams of 
digital bits by passing them through shift registers that are clocked at the sampling 
frequency. Shift registers with different numbers of stages thus provide different 
fixed delays. A method of using two shift registers to obtain a delay that is variable in 
increments of the clock pulse interval is described by Napier et al. (1983). However, 
integrated circuits for random access memory (RAM), developed for computer 
applications, provide an economical solution for large digital delays. 

Another useful technique is serial-to-parallel conversion, that is, the division of 
a bit stream at frequency ν into n parallel streams at frequency ν/n, where n is a 
power-of-two integer. This allows the use of slower and more economical types of 
digital circuits for delay, correlation, and other processes. 

The precision required in setting a delay has been discussed in Sect. 7.3.5 and is 
usually some fraction of the reciprocal of the signal bandwidth. In any form of delay 
that operates at the frequency of the sampler clock, the basic delay increment is the 
reciprocal of the sampling frequency. A finer delay step can be obtained digitally 
by varying the timing of the sample pulse in a number of steps, for example, 16, 
between the basic timing pulses. Thus, if an extra delay of, say, 5/16 of a clock 
interval is required, the sampler is activated 11/16 of a clock interval after the 
previous clock pulse, and the data are held for 5/16 of an interval to bring them into 
phase with the clock-pulse timing. Correction for delay steps equal to the sampling 
interval can also be made after the signals have been cross-correlated, by applying 
a phase correction to the cross power spectrum. 
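The delay bookkeeping described above can be sketched in a few lines. The following is an illustration only: the function names are hypothetical, the 16 fine steps follow the example in the text, and the sign of the residual phase is an assumed convention.

```python
import math


def split_delay(delay_s, f_sample_hz, fine_steps=16):
    """Split a required delay into whole sample intervals (shift register or
    RAM) and a fine step in units of 1/fine_steps of a sample interval."""
    total_fine = round(delay_s * f_sample_hz * fine_steps)
    whole, fine = divmod(total_fine, fine_steps)
    residual_s = delay_s - total_fine / (fine_steps * f_sample_hz)
    return whole, fine, residual_s


def sampler_timing(fine, fine_steps=16):
    """Fraction of a clock interval after the previous clock pulse at which
    the sampler is activated, and the fraction for which the data are held."""
    return (fine_steps - fine) / fine_steps, fine / fine_steps


def residual_phase(freq_hz, residual_s):
    """Phase (radians) across the cross power spectrum that corresponds to an
    uncorrected delay residual_s at baseband frequency freq_hz."""
    return 2.0 * math.pi * freq_hz * residual_s


if __name__ == "__main__":
    whole, fine, residual = split_delay(33.125e-9, 100e6)
    print(whole, fine, residual, sampler_timing(fine))
```

For a delay of 3 5/16 sample intervals, the sketch returns three whole intervals and a fine step of 5, so the sampler is activated 11/16 of an interval after the previous clock pulse and the data are held for 5/16 of an interval, as in the example in the text.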


8.7 Quadrature Phase Shift of a Digital Signal 


We have mentioned that complex correlators for digital signals can be implemented 
by introducing the quadrature phase shift in the analog signal, as in Fig. 6.3, 
and then using separate samplers for the signal and its phase-shifted version. The 
Hilbert transformation that the phase shift represents can also be performed on the 
digital signal, thus eliminating the quadrature network and saving samplers and 
delay lines, but the accuracy is limited. Hilbert transformation is mathematically 
equivalent to convolution with the function (−πτ)⁻¹, which extends to infinity in 
both directions [see, e.g., Bracewell (2000), p. 364]. A truncated sequence of the 
same form, for example, 1/3, 0, 1, 0, −1, 0, −1/3, provides a convolving function for the 
digital data that introduces the required phase shift. However, the truncation results 
in convolution of the resulting signal spectrum with the Fourier transform of the 
truncation function, that is, a sinc function. This introduces ripples and degrades the 
signal-to-noise ratio by a few percent. Also, the summation process in the digital 
convolution increases the number of bits in the data samples, but the low-order 
bits can be discarded to avoid a major increase in the complexity of the correlator. 
This results in a further quantization loss. The overall result is that the imaginary 
output of the correlator suffers spectral distortion and some loss in signal-to-noise 
ratio relative to the real output. These effects are most serious in broad-bandwidth 
systems, in which the high data rate permits only simple processing. Lo et al. (1984) 
have described a system in which the real part of the correlation is measured as 
a function of time offset, as described below for the spectral correlator, and the 
imaginary part is then computed by Hilbert transformation. 
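The effect of the truncated convolving sequence can be illustrated with a short sketch. The seven-point sequence is the one quoted above; the overall scale factor 2/π, the function names, and the test frequency are assumptions made for this example. For a cosine input the output is a negative sine whose amplitude varies with frequency, which is the ripple caused by the truncation.

```python
import math

# Seven-point sequence 1/3, 0, 1, 0, -1, 0, -1/3 at offsets k = -3 ... +3.
KERNEL = {-3: 1.0 / 3.0, -1: 1.0, 1: -1.0, 3: -1.0 / 3.0}
SCALE = 2.0 / math.pi      # assumed normalization for unit sample spacing


def quadrature(samples):
    """Convolve real samples with the truncated Hilbert-transform sequence.
    Output element i corresponds to input index m = i + 3."""
    out = []
    for m in range(3, len(samples) - 3):
        out.append(SCALE * sum(c * samples[m - k] for k, c in KERNEL.items()))
    return out


def gain(omega):
    """Amplitude response of the seven-point sequence at angular frequency
    omega (radians per sample)."""
    return (4.0 / math.pi) * (math.sin(omega) + math.sin(3.0 * omega) / 3.0)


if __name__ == "__main__":
    omega = 0.7
    x = [math.cos(omega * m) for m in range(40)]
    y = quadrature(x)
    print("output / (-sin):", y[5] / -math.sin(omega * 8))
    print("gain at several frequencies:",
          [round(gain(w), 3) for w in (0.3, 0.8, math.pi / 2, 2.4)])
```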


8.8 Digital Correlators 


8.8.1 Correlators for Continuum Observations 


In continuum observations, the average correlation over the signal bandwidth is 
measured, and data on a finer frequency scale may not be required. In such cases, 
the correlation of the signals is measured usually for zero time-delay offset. Digital 
correlators can be designed to run at the sampling frequency of the signals or at a 
submultiple resulting from dividing the bit stream from the sampler into a number 
of parallel streams. In the latter case, the number of correlator units must be 
proportionally increased, and their outputs can subsequently be additively combined. 
Two-level and three-level correlators, for which the products are represented by 
values of −1, 0, and +1, are the simplest. Correlators in which one of the inputs is 
a two-level or three-level signal and the other input is more highly quantized also 
have a degree of simplicity. In this case, the correlator is essentially an accumulating 
register into which the higher-quantization value is entered. The two-level or 
three-level value is used to specify whether the other number is to be added, subtracted, 
or ignored. In correlators in which both inputs have more than three levels of 
quantization, the multiplier output for any single product can be one of a range 
of numbers. One method of implementing such a multiplier is to use a read-only 
memory unit as a lookup table in which the possible product values are stored. The 
input bits to be multiplied are used to specify the address of the required product in 
the memory. 

The output of a multiplier can take both positive and negative values, and, 
ideally, an up-down counter is required as an integrator. Since such counters are 
usually slower than simple adding counters, two of the latter are sometimes used to 
accumulate the positive and negative counts independently. Another technique is to 
count, for example, −1, 0, and +1 as 0, 1, and 2, and then subtract the excess values, 
in this case equal to the number of products, in the subsequent processing. 
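The offset-counting technique amounts to the following bookkeeping. This is a minimal sketch in software of what would be done in hardware, and the function names are hypothetical.

```python
def accumulate_offset(products):
    """Accumulate three-level products (-1, 0, +1) in an add-only counter by
    counting them as 0, 1, and 2."""
    counter = 0
    for p in products:
        counter += p + 1          # increment is never negative
    return counter


def recover_sum(counter, n_products):
    """Remove the excess of one count per product in later processing."""
    return counter - n_products


if __name__ == "__main__":
    products = [1, -1, 0, 1, 1, -1, 0]
    count = accumulate_offset(products)
    print(count, recover_sum(count, len(products)))
```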

Spectral line (multichannel) correlators are used with most large general-purpose 
arrays. For continuum observations, they offer advantages such as the ability to 
reject narrowband interfering signals or to divide a band into narrower sub-bands to 
reduce the smearing of spectral details. 


8.8.2 Digital Spectral Line Measurements 


In spectral line observations, measurements at different frequencies across the signal 
band are required. These measurements can be obtained by digital techniques using 
a spectral correlator system, which is commonly implemented by measuring the 
correlation of the signals as a function of time offset. The Fourier transform of 
this quantity is the cross power spectrum, which can be regarded as the complex 
visibility as a function of frequency. (This Fourier transform relationship is a form of 
the Wiener-Khinchin relation discussed in Sect. 3.2.) In the case of an autocorrelator 
(for use with a single antenna), the two input signals are the same waveform with a 
time offset. Thus, the autocorrelation function is symmetric, and the power spectrum 
is entirely real and even. However, the cross power spectrum of the signals from two 
different antennas is complex, and the cross-correlation function has odd as well as 
even parts. 

The output of a spectral correlator system provides values of the visibility at N 
frequency intervals across the signal band. These intervals are sometimes spoken of 
as frequency channels and their spacing as the channel bandwidth. To explain the 
action of a digital spectral correlator, we consider the cross power spectrum S(ν) 
of the signals from two antennas, as shown in idealized form in Fig. 8.18. Here it is 
assumed that the source under observation has a flat spectrum with no line features, 
and the final IF amplifier before the sampler has a rectangular baseband response. 
In Fig. 8.18, we have included the negative frequencies since they are necessary in 
the Fourier transform relationships. For −Δν < ν < Δν, the real and imaginary 
parts of S(ν) have magnitudes a and b, respectively, and the corresponding visibility 
phase is tan⁻¹(b/a). The cross-correlation function ρ(τ) is the Fourier transform of 
S(ν), where τ is the time offset: 

$$
\begin{aligned}
\rho(\tau) &= (a - jb) \int_{-\Delta\nu}^{0} e^{j2\pi\nu\tau}\, d\nu + (a + jb) \int_{0}^{\Delta\nu} e^{j2\pi\nu\tau}\, d\nu \\
&= 2\Delta\nu \left[ a\, \frac{\sin(2\pi\Delta\nu\,\tau)}{2\pi\Delta\nu\,\tau} - b\, \frac{1 - \cos(2\pi\Delta\nu\,\tau)}{2\pi\Delta\nu\,\tau} \right] .
\end{aligned} \tag{8.113}
$$


Fig. 8.18 Cross power spectrum S(ν) of two signals for which the power spectra 
are rectangular bands extending in frequency from zero to Δν. Negative 
frequencies are included. The solid line represents the real part of S(ν) and the dashed 
line the imaginary part. The corresponding correlation function is derived in 
Eq. (8.113). 

Thus, ρ(τ) has an even component of the form (sin x)/x, which is related to the real 
part of S(ν), and an odd component of the form (1 − cos x)/x, which is related 
to the imaginary part. The spectral correlator measures ρ(τ) at integral multiples 
of the sampling interval τ_s. We consider the case of Nyquist sampling, for which 
τ_s = 1/(2Δν). The measured cross-correlation refers to the quantized waveforms, 
and the analysis in Sect. 8.4.1 shows how this is related to the cross-correlation of 
the unquantized waveforms. For correlation levels that are not too large, the two 
quantities are closely proportional, so for simplicity, we assume that Eq. (8.113) 
represents the behavior of the measured cross-correlation. The measurements are 
made with 2N time offsets from −Nτ_s to (N − 1)τ_s between the signals, and Fourier 
transformation of these discrete values yields the cross power spectrum at frequency 
intervals of (2Nτ_s)⁻¹ = Δν/N for Nyquist sampling. The N complex values of 
the positive frequency spectrum are the data required. Of these, the imaginary 
part comes from the odd component of the correlator output r(τ). Thus, in the 
correlation measurement, it suffices to use single-multiplier correlators to measure 
2N real values of r(τ) over both positive and negative values of τ for one antenna 
with respect to the other. As an alternative to measuring only the real part of 
the correlation, complex correlators could be used to measure both the real and 
imaginary parts for a range of time offsets from zero to (N − 1)τ_s. However, complex 
correlators require broadband quadrature networks. 

Measurement of the cross-correlation over the limited time offset range is 
equivalent to measuring r(τ) multiplied by a rectangular function of width 2Nτ_s. 
The cross power spectrum derived from the limited measurements is therefore 
equal to the true cross power spectrum convolved with the Fourier transform of 
the rectangular function, that is, with the sinc function 

$$
\frac{\sin(\pi\nu N/\Delta\nu)}{\pi\nu} , \tag{8.114}
$$

which is normalized to unit area with respect to ν. Any line feature within the 
spectrum is broadened by the sinc function (8.114) and, depending on its frequency 
profile, may show the characteristic oscillating skirts. The width of the sinc function 
at the half-maximum level is 1.2Δν/N, that is, 1.2 times the channel separation, and 
this width defines the effective frequency resolution. 

The oscillations of the sinc function introduce structure in the frequency 
spectrum similar to the sidelobe responses of an antenna beam. They result from the 
sharp edges of the rectangular function that multiplies the correlation function. Such 
sidelobes are undesirable and can be reduced by choosing weighting functions, other 
than rectangular truncation, that are constrained to be zero outside the measurement 
range. It is desirable that weighting functions should taper smoothly to zero at 
|τ| = Nτ_s, thereby reducing unwanted ripples in the smoothing (convolving) 
function, but also to be as wide as possible in order to keep the width of the 
smoothing function as narrow as possible. These requirements are not generally 
compatible, so weighting functions that produce smoothing functions with very 
low sidelobes have poor frequency resolution. Some commonly used weighting 
functions are listed in Table 8.5. 
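The half-amplitude widths quoted in Table 8.5 can be checked numerically. The sketch below assumes that each smoothing function is the Fourier transform of the weighting function truncated at |τ| = τ_1 = Nτ_s, and it measures frequency x in units of the channel spacing Δν/N. A weighting term cos(mπτ/τ_1) then contributes a pair of sinc functions displaced by ±m channels. The function names and the bisection search are choices made for this example.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def cosine_sum_response(coeffs):
    """Smoothing function for w = c0 + c1 cos(pi tau/tau1) + c2 cos(2 pi tau/tau1) + ..."""
    def response(x):
        value = coeffs[0] * sinc(x)
        for m, c in enumerate(coeffs[1:], start=1):
            value += 0.5 * c * (sinc(x - m) + sinc(x + m))
        return value
    return response


RESPONSES = {
    "Uniform": cosine_sum_response([1.0]),
    "Bartlett": lambda x: sinc(0.5 * x) ** 2,   # transform of a triangle
    "Hann": cosine_sum_response([0.5, 0.5]),
    "Hamming": cosine_sum_response([0.54, 0.46]),
    "Blackman": cosine_sum_response([0.42, 0.50, 0.08]),
}


def half_amplitude_width(response):
    """Full width at half of the peak value, in units of the channel spacing,
    found by bisection between x = 0 and x = 2."""
    half = 0.5 * response(0.0)
    lo, hi = 0.0, 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if response(mid) > half:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo


if __name__ == "__main__":
    for name, response in RESPONSES.items():
        print(f"{name:10s} {half_amplitude_width(response):.2f}")
```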

Table 8.5 Commonly used weighting functions 

Weighting          w(τ)                                          Half-amplitude width   Peak 
function           [w(τ) = 0 for |τ| > τ_1 = Nτ_s]               (unit = Δν/N)          sidelobe 
-----------------  --------------------------------------------  ---------------------  -------- 
Uniform            w(τ) = 1                                      1.21                   0.22 
Bartlett           w(τ) = 1 − (|τ|/τ_1)                          1.77                   0.047 
Hann*              w(τ) = 0.5 + 0.5 cos(πτ/τ_1)                  2.00                   0.027 
Hamming            w(τ) = 0.54 + 0.46 cos(πτ/τ_1)                1.82                   0.0073 
Blackman           w(τ) = 0.42 + 0.50 cos(πτ/τ_1)                2.30                   0.0011 
                          + 0.08 cos(2πτ/τ_1) 
Blackman-Harris    w(τ) = 0.35875 + 0.48829 cos(πτ/τ_1)          2.67                   0.000025 
                          + 0.14128 cos(2πτ/τ_1) 
                          + 0.01168 cos(3πτ/τ_1) 

*Hann weighting is named after the nineteenth-century meteorologist Julius von Hann and is 
sometimes colloquially referred to as “Hanning weighting.” Hamming weighting is named after 
R. W. Hamming, an engineer at Bell Telephone Laboratories. 

Hann weighting, also known as raised cosine weighting, reduces the first sidelobe 
by a factor of 9 but degrades the resolution by 1.67, compared with uniform 
weighting. The Fourier transform of the Hann weighting function is the sum of 
three sinc functions of relative amplitudes 0.25, 0.5, and 0.25. This is the smoothing 
function in the spectral domain shown in Fig. 8.19b, which corresponds to Hann 
weighting. For the usual case in which the number of points in the discretely 
sampled spectrum equals the number of points in the correlation function (i.e., no 
zero padding, as in the FX correlator, Sect. 8.8.4), the smoothing or convolution 
can be implemented as a three-point running mean with relative weights of 0.25, 
0.5, and 0.25. Thus, the smoothed value of the cross power spectrum at frequency 
channel n is given by 

$$
S'\!\left(\frac{n\Delta\nu}{N}\right) = \frac{1}{4}\, S\!\left(\frac{(n-1)\Delta\nu}{N}\right) + \frac{1}{2}\, S\!\left(\frac{n\Delta\nu}{N}\right) + \frac{1}{4}\, S\!\left(\frac{(n+1)\Delta\nu}{N}\right) . \tag{8.115}
$$
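The equivalence between Hann weighting of the correlation function and the three-point running mean of Eq. (8.115) can be verified with a small discrete Fourier transform. The sketch below is an illustration under stated assumptions: the 2N lags are held in DFT (wrap-around) order, the running mean wraps around at the ends of the 2N-point spectrum, and the function names and the test data are arbitrary.

```python
import cmath
import math


def dft(values):
    """Direct discrete Fourier transform (adequate for small arrays)."""
    m = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * n * k / m)
                for k, v in enumerate(values))
            for n in range(m)]


def hann_weights(num_lags):
    """w = 0.5 + 0.5 cos(pi k / N) for 2N lags, where num_lags = 2N. The
    cosine is periodic, so k = 0 ... 2N-1 is equivalent to k = -N ... N-1."""
    return [0.5 + 0.5 * math.cos(2.0 * math.pi * k / num_lags)
            for k in range(num_lags)]


def three_point_smooth(spectrum):
    """Running mean with weights 0.25, 0.5, 0.25, as in Eq. (8.115)."""
    m = len(spectrum)
    return [0.25 * spectrum[(n - 1) % m] + 0.5 * spectrum[n]
            + 0.25 * spectrum[(n + 1) % m] for n in range(m)]


if __name__ == "__main__":
    lags = [math.sin(0.9 * k) + 0.3 * k for k in range(16)]   # arbitrary data
    weighted = dft([r * w for r, w in zip(lags, hann_weights(16))])
    smoothed = three_point_smooth(dft(lags))
    print(max(abs(p - q) for p, q in zip(weighted, smoothed)))
```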


The Hamming weighting function is very similar to the Hann function and would 
appear to be superior because it produces a better resolution and a lower peak 
sidelobe level. However, the sidelobes of the Hamming smoothing function do not 
decrease in amplitude as rapidly as those of the Hann smoothing function. Weighting 
functions are discussed in detail by Blackman and Tukey (1959) and Harris (1978). 

A further effect of the finite time-offset range complicates the calibration of the 
instrumental frequency response in the following way (Willis and Bregman 1981). 
The frequency responses of the amplifiers associated with the different antennas 
may not be exactly identical, as discussed in Sect. 7.3. To calibrate the response 
of each antenna pair over the spectral channels, it is usual to measure the cross 
power spectrum of an unresolved source for which the actual radiated spectrum 
is known to be flat across the receiving passband. We can consider the result in 
terms of the idealized power spectra in Fig. 8.18. If no special weighting function is 
used, the real and imaginary parts are both convolved with the sinc function (8.114). 

Fig. 8.19 (a) The ordinate is the sinc function sin(πνN/Δν)/(πνN/Δν), which represents the 
frequency response of a spectral correlator with channels of width Δν/N to a narrow line at ν = 0. 
The abscissa is frequency ν measured with respect to the center of the received signal band. (b) The 
same curve after the application of Hann smoothing, as in Eq. (8.115). 


When a function with a sharp edge is convolved with a sinc function, the result is 
the appearance of oscillations (the Gibbs phenomenon) near the edge, as shown in 
Fig. 8.20. The point here is that the real component of S(v) in Fig. 8.18 is continuous 
through zero frequency, but the imaginary part shows a sharp sign reversal. Thus, 
near zero frequency, the observed imaginary part of S(v) will show oscillations 
that may be as high as 18% in peak amplitude, whereas the real component will 
show relatively small oscillations at that point (see also Fig. 10.14b and associated 
text). As a result, the magnitude and phase measured for S(v) will show oscillations 
or ripples, the amplitude of which will depend on the relative amplitudes of the 
real and imaginary parts, that is, on the phase of the uncalibrated visibility. The 
uncalibrated phase measured for any source depends on instrumental factors such 
as the lengths of cables as well as the source position, which may not be known. In 
general, the phase will not be the same for the source under investigation and the 
calibrator. Hence, near zero frequency, some precautions must be taken in applying 
the calibration. Possible solutions to the problem include (1) calibrating the real 
and imaginary parts separately, (2) observing over a wide enough band that the 
end channels in which the ripples are strongest can be discarded, or (3) applying 
smoothing in frequency to reduce the ripples. 
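The ripple near zero frequency can be reproduced from Eq. (8.113). The following sketch samples ρ(τ) at the 2N Nyquist-spaced lags and transforms the samples to N positive-frequency channels with a direct sum. It assumes the idealized rectangular spectrum of Fig. 8.18; the values of a, b, and N and the function names are arbitrary choices for this example.

```python
import cmath
import math


def rho(tau, a, b, dnu):
    """Cross-correlation function of Eq. (8.113)."""
    x = 2.0 * math.pi * dnu * tau
    if x == 0.0:
        return 2.0 * dnu * a
    return 2.0 * dnu * (a * math.sin(x) / x - b * (1.0 - math.cos(x)) / x)


def lag_spectrum(a, b, dnu, n_chan):
    """Cross power spectrum at nu = n dnu / N, n = 0 ... N-1, from 2N lags
    spaced by tau_s = 1/(2 dnu) and running from -N tau_s to (N-1) tau_s."""
    tau_s = 1.0 / (2.0 * dnu)
    spectrum = []
    for n in range(n_chan):
        nu = n * dnu / n_chan
        total = sum(rho(k * tau_s, a, b, dnu)
                    * cmath.exp(-2j * math.pi * nu * k * tau_s)
                    for k in range(-n_chan, n_chan))
        spectrum.append(tau_s * total)
    return spectrum


if __name__ == "__main__":
    a, b = 1.0, 0.5
    spec = lag_spectrum(a, b, dnu=1.0, n_chan=32)
    for n in (0, 1, 2, 3, 16):
        print(n, round(spec[n].real / a, 4), round(spec[n].imag / b, 4))
```

In this idealized case the real part is recovered without ripple, while the imaginary part overshoots in the channel adjacent to zero frequency and settles toward b in mid-band.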

Fig. 8.20 Convolution of a step function at the origin (broken line) with the sinc function 
(sin πx)/πx. Here, x = νN/Δν, and the half-cycle period of the ripple is approximately equal 
to the width of a spectral channel. 

Another problem encountered when observing a spectral line in the presence 
of a continuum background is caused by reflections in the antenna structure. These 
reflections cause a sinusoidal gain variation across the passband, the period of which 
is equal to the reciprocal of the delay of the signal caused by the reflection. In a 
correlation interferometer, the magnitude of the ripple is a nearly constant fraction of 
the correlated continuum flux density, and the ripple is removed when the spectrum 
of the source under investigation is divided by the spectrum of the calibration source. 


8.8.3 Lag (XF) Correlator 


Correlators can be classified as two general types. In a lag (or XF) correlator, 
cross-correlation is followed by Fourier transformation, and in an FX correlator, 
Fourier transformation is followed by cross-correlation. A simplified schematic 
diagram of a lag correlator is shown in Fig. 8.21. Practical systems are often more 
complicated and are designed to take full advantage of the flexibility of digital 
processing techniques. The bandwidths of channels required for spectral line studies 
vary greatly, from a few tens of hertz to hundreds of megahertz. This versatility is 
necessary because the widths of spectral features are influenced by Doppler shifts, 
which are proportional to the rest frequencies of the lines and the velocities of the 
emitting atoms and molecules. The correlator of the upgraded VLA system (Perley 
et al. 2009) is fundamentally an XF design, as is the ALMA system, following its 
digital filter (Escoffier et al. 2000). 

A recirculating correlator is one that can store blocks of data and process them 
multiple times through the correlator. This can be done only when the correlator is 
capable of running faster than the incoming data rate. These multiple passes allow 
the number of correlator channels to be increased. For example, if data samples are 
processed by the correlator twice, the range of delays can be doubled, so the spectral 
resolution is improved by a factor of two. 

Fig. 8.21 Simplified schematic diagram of a lag (XF) spectral correlator for two sampled signals. 
τ_s indicates a time delay equal to the sampling interval and C indicates a correlator. The correlation 
is measured for zero delay, for the x̂ input delayed with respect to the ŷ input (left correlator bank), 
and for ŷ delayed with respect to x̂ (right correlator bank). The delays are integral multiples of τ_s. 

To implement the above scheme, recirculator units are required, which are 
basically memories that store blocks of input samples and allow them to be read out 
at the correlator input rate. These memory units are required in pairs, so that one 
is filled with data at the Nyquist rate appropriate to the chosen signal bandwidth, 
while the other is being read at the maximum data rate. One memory becomes filled 
in the time that the other is read for the required number of times, and the two 
are then interchanged. Examples of recirculating lag correlators are described by 
Ball (1973) and Okumura et al. (2000). The WIDAR correlator on the VLA uses 
recirculation (Perley et al. 2009). 


8.8.4 FX Correlator 


The designation FX indicates a correlator in which Fourier transformation to the 
frequency domain is performed before cross multiplication of data from different 
antennas. In such a correlator, the input bit stream from each antenna is converted 
to a frequency spectrum by a real-time FFT, and then for each antenna pair, the 
complex amplitudes for each frequency are multiplied to produce the cross power 
spectrum. A major part of the computation occurs in the Fourier transformation, 
for which the total number of operations is proportional to the number of antennas. 
In comparison, in a lag correlator, the total computation is largely proportional to 
the number of antenna pairs. Thus, the FX scheme offers economy in hardware, 
especially if the number of antennas is large (see Sect. 8.8.5). The principle of the 
FX correlator, based on the use of the FFT algorithm, was discussed by Yen (1974) 
and first used in a large practical system by Chikada et al. (1984, 1987). Descriptions 
of the system designed for the VLBA are given by Benson (1995) and Romney (1999). 

Two slightly different implementations of the FX correlator have been used. In 
one, both in-phase and quadrature components of the signal are sampled to provide 
a sequence of N complex samples, which is then Fourier-transformed to provide N 
values of complex amplitude, distributed in frequency over positive and negative 
frequencies. In the other, N real samples are transformed to provide N values of 
complex amplitude. However, the negative frequencies are redundant, and only N/2 
spectral points need be retained. We follow the second scheme in the discussion 
below. 

Figure 8.22 is a schematic diagram of the basic operations of an FX correlator. 
The input sample stream from an antenna is Fourier transformed in contiguous 
sequences of length-N samples, where N is usually a power-of-two integer for 
efficiency in the FFT algorithm. The output of each transformation is a series of 
N complex signal amplitudes as a function of frequency. The frequency spacing 
of the data after transformation is 1/(Nτ_s), where τ_s is the time interval between 
samples of the signals. In the cross-multiplication process that follows the FFT 
stage, the complex amplitude from one antenna of each pair is multiplied by the 
complex conjugate of the amplitude of the other. These multiplications occur in 
the correlator elements in Fig. 8.22. Note that the data in any one input sequence 
are combined only with data from other antennas for the same time sequence. This 
leads to some differences in the effective weighting of the data in the FX and XF 
designs. 
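The basic FX operations described above can be sketched as follows for one antenna pair. This is a minimal illustration, not the design of any particular correlator: a direct DFT stands in for the FFT, the samples are unquantized real numbers, and the function names and test signals are assumptions. Each segment of N real samples is transformed, N/2 channels are retained, and the product of one spectrum with the complex conjugate of the other is accumulated over segments.

```python
import cmath
import math


def dft(block):
    """Direct DFT of one segment (a stand-in for the FFT)."""
    n = len(block)
    return [sum(x * cmath.exp(-2j * math.pi * c * m / n)
                for m, x in enumerate(block))
            for c in range(n)]


def fx_correlate(x, y, n):
    """Average cross power spectrum of x and y in N/2 channels, using
    contiguous segments of N samples from each input."""
    n_seg = min(len(x), len(y)) // n
    acc = [0j] * (n // 2)
    for s in range(n_seg):
        fx = dft(x[s * n:(s + 1) * n])
        fy = dft(y[s * n:(s + 1) * n])
        for c in range(n // 2):
            acc[c] += fx[c] * fy[c].conjugate()
    return [v / n_seg for v in acc]


if __name__ == "__main__":
    n, chan, delay = 16, 3, 1      # y is x delayed by one sample
    x = [math.cos(2.0 * math.pi * chan * m / n) for m in range(64)]
    y = [math.cos(2.0 * math.pi * chan * (m - delay) / n) for m in range(64)]
    spec = fx_correlate(x, y, n)
    peak = max(range(n // 2), key=lambda c: abs(spec[c]))
    print(peak, abs(spec[peak]), cmath.phase(spec[peak]))
```

For a tone centered on a channel, the phase of the cross power spectrum in that channel is 2π times the product of the channel frequency and the relative delay.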


8.8.5 Comparison of XF and FX Correlators 


Spectral Response. In the FX configuration, the F engine (DFT processor) operates 
on short segmented blocks of data in order to control the spectral resolution. The 
equivalent correlation function constructed from a block of data has N ways, or N 
possible multiplications for the zero lag component. There are progressively fewer 
multiplications available for increasing lags because of the data block boundaries. 
The correlation function at the maximum lags of ±(N − 1)τ_s can be obtained in 
only one way. Hence, the density of lag multiplications has a triangular shape as a 
function of lag over the range ±Nτ_s, as shown in Fig. 8.23 (see also Moran 1976). 
The spectral response, which is the Fourier transform of this triangular function, is 
therefore sinc²(Nτ_sν) = sinc²(n), where ν = n/(Nτ_s) and n is the spectral channel 
number. An alternate derivation of this result is given in Sect. A8.4.1, where it is 
shown that the spectral response to a sine wave is a sinc² function. 

Fig. 8.22 Simplified schematic diagram of an FX correlator for two antennas. The digitized 
signals are read into the shift registers and an FFT performed at intervals of N sample periods. 
The correlator elements, indicated by C, form products of one signal with the complex conjugate 
of the other. In an array with n_a antennas, the outputs of each FFT are split (n_a − 1) ways for 
combination with the complex amplitudes from all other antennas. 

For the XF configuration, the spectral resolution depends on the length of the 
correlation function, calculated as described in Sect. 8.8.2. Since the correlation 
function is calculated on a segment of data that is much longer than the block length, 
the density of lag multiplications is essentially uniform, except for a very small end 
effect. Hence, the spectral response is sinc(Nτ_sν) or sinc(n). The spectral responses 
for the FX and XF correlators are shown in Fig. 8.23. 

Note that the integral over frequency of both of these spectral responses is unity. 
Therefore, the flux or the area under a spectral line profile, the convolution of 
the source spectrum with the spectral response function, is conserved. The peak 
amplitude of a spectral feature narrower than the resolution will depend on where 
it falls with respect to the spectral channels. A line that falls midway between two 
channels will have its peak amplitude reduced by sinc²(1/2) = 0.41 for the FX 
processor compared with sinc(1/2) = 0.64 for the XF processor. This is the 
well-known effect called scalloping. It can be mitigated by the technique of padding with 
zeros to obtain an interpolated spectrum (see Sect. A8.4.2). 

Fig. 8.23 (left) The density of lag calculations, or intrinsic weighting function, for an FX 
correlator (solid line) and an XF correlator (dotted line). N is the segment size for the FX correlator. 
For comparison purposes, the width of the function for the XF correlator is chosen to make the 
number of spectral channels the same. (right) The spectral response for the FX correlator (solid 
line), a sinc² function (see Appendix 8.4 for additional explanation) and the XF correlator (dotted 
line), a sinc function given by Eq. (8.114). Adapted from Romney (1999) and Deller et al. (2016). 


In some cases, it may be desirable to actually calculate the correlation functions 
from the output of the F engine, such as for application of a full nonlinear 
quantization correction. It is well known (Press et al. 1992) that it is necessary to pad 
the spectrum with N zeros in order to obtain the correct result [see discussion after 
Eq. (A8.40)]. The implementation of this calculation is discussed by O’Sullivan 
(1982) and Granlund (1986). 


Signal-to-Noise Ratio. The fundamental difference between the FX and XF processors 
is the density weighting in the lag domain. Both systems have the same 
number of equivalent multiplications, as can be seen in Fig. 8.23. The FX covers 
twice the number of lags as the XF processor but has lower density as the lag 
number increases. In particular, the FX provides half the lag density for k = N/2. 
For a continuum source, the signal-to-noise ratio of the FX and XF systems is the 
same. This can be appreciated by the fact that only the zero lag multiplications are 
important, and they are equal in both systems. Similarly, the signal-to-noise ratio for 
a very-narrow-bandwidth source, less than the resolution, is also the same because 
the total number of equivalent multiplications is the same. 

There is a small difference in response for signals that have line widths about 
equal to the resolution. In particular, for this case, the amplitude of a spectral 
line is reduced by a factor of about 0.82 (Okumura et al. 2001; Bunton 2005). 
This is a problem only for slightly resolved spectral features. In any event, most 
spectrometers are designed to produce several channels per resolution element in 
order to properly analyze the lines. This perceived deficiency in the FX correlator 
is due to the distribution of lags. The FX correlator has a larger range but fewer 
multiplications at lag $k \sim N/2$ (see Fig. 8.23). There are several approaches to 
recovering this loss of information. The classic method (Welch 1967; Percival and 
Walden 1993) is to overlap the segments in the block processing in the F engine. 
A 50% overlap recovers most of the lost signal-to-noise ratio but at a cost of 
doubling the processing time in the F engine. This overlap feature was available in 
the original FX VLBA processors but rarely, if ever, used (Romney 1995). Another 
approach is to simply channel-average the spectrum, but this wastes the resolution 
capability of the F engine. Note that with polyphase filter banks, scalloping for 
narrow spectral lines and signal-to-noise ratio loss are very small. 


Number of Operations. We can make an approximate comparison of the workload 
requirements of XF and FX signal processors by comparing the number of 
multiplications needed in each system. For this rather simplistic analysis, we assume 
that the data are streams of real numbers at the Nyquist interval appropriate for 
bandwidth $\Delta\nu$, i.e., $\tau_s = 1/(2\Delta\nu)$. To make this comparison, we further assume 
that the number of lags computed in the X engine (lag correlator), N, is equal to the 
data segment length into the F engine. This makes the spectral resolution of both 
systems approximately equal (see Fig. 8.23 for exact responses). 

Consider the analysis of one second of data, i.e., $2\Delta\nu$ samples. For the XF 
system, a lag correlator is required for each baseline. Thus, $2N\Delta\nu$ multiplications 
are required for each baseline. Since $N \ll 2\Delta\nu$, the edge effects in calculating 
the correlation function are negligible (i.e., all lags have almost the same number 
of multiplications approaching N), and the workload of the single Fourier transform 
at the end of the integration period is negligible compared with the workload of 
calculating the correlation function. Thus, the rate of multiplications (multiplies per 
second), $r_{\rm XF}$, is 

$$ r_{\rm XF} = 2\Delta\nu\, N n_b , \tag{8.116} $$


where $n_b$ is the number of baselines. For the FX processor, one DFT engine 
is required for each antenna. We assume that the number of multiplications for 
the FFT implementation of the N-point DFT is $N\log_2 N$. (Some variation exists, 
depending on the FFT implementation, e.g., an FFT with N being a power of four 
would run somewhat faster.) The cross power spectrum calculation requires the 
pairwise cross multiplication of the outputs of DFT engines for all baselines. These 
multiplications are complex, requiring four real multiplications each. In addition, 
only the N/2 spectral points at positive frequencies need to be calculated and 
retained. The number of multiplications is therefore $[n_a N\log_2 N + 4Nn_b/2]M$, where 
M is the number of segments processed, $2\Delta\nu/N$. Since $MN = 2\Delta\nu$, the aggregate 
multiplication rate is 

$$ r_{\rm FX} = 2\Delta\nu\,[n_a\log_2 N + 2n_b] . \tag{8.117} $$

The workload ratio, $R = r_{\rm XF}/r_{\rm FX}$, is therefore 

$$ R = \frac{N n_b}{n_a\log_2 N + 2n_b} . \tag{8.118} $$

The $n_a\log_2 N$ factor reflects the $\log_2 N$ advantage and the antenna-based processing 
of the DFT engine. The $n_b N$ factor reflects the baseline processing of the X engine. 
Since $n_b = n_a(n_a - 1)/2$, we can rewrite Eq. (8.118) as 

$$ R = \frac{N(n_a - 1)}{2[\log_2 N + (n_a - 1)]} . \tag{8.119} $$

Note that this relation holds for $n_a \geq 2$ because no single antenna spectra 
are calculated. [Analysis for a spectrometer on a single antenna would yield 
$R = N/(\log_2 N + 1)$.] The limiting forms of Eq. (8.119) are 

$$ R \simeq \frac{N}{2} , \qquad n_a \gg 1 + \log_2 N , $$

$$ R \simeq \frac{N(n_a - 1)}{2\log_2 N} , \qquad n_a \ll 1 + \log_2 N . \tag{8.120} $$

In general, the larger the values of N or $n_a$, the more the FX design is favored. 
For example, with $n_a = 10$ and $N = 1024$, $R \simeq 240$. Perhaps the most 
important limitation in Eq. (8.119) is that the X engine operates on a one- or few-bit 
representation of the signal and multiplication can be achieved by simple table 
lookup, whereas the F engine needs more bits per sample and there is additional bit 
growth in its internal operations. Furthermore, the detailed architecture of chips has 
a major influence on calculation speed. Hence, Eq. (8.119) is a useful guide for the 
general dependence of R on N and $n_a$ but does not accurately specify a crossover 
point favoring one design over the other. The advantage clearly shifts to the FX 
design for very large $n_a$ or N. 
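
The multiplication counts above are easy to tabulate. The sketch below is illustrative only: it encodes the per-sample multiplication rates of Eqs. (8.116) and (8.117), with the common factor of twice the bandwidth cancelled, and the function name `workload_ratio` is introduced here rather than taken from the text. As the paragraph above notes, it ignores word length and chip architecture.

```python
import math

def workload_ratio(n_antennas: int, n_lags: int) -> float:
    """Ratio of XF to FX multiplication rates for n_antennas >= 2.

    n_lags is both the number of lags in the X engine and the
    segment length of the F engine, as assumed in the text.
    """
    n_baselines = n_antennas * (n_antennas - 1) / 2
    rate_xf = n_lags * n_baselines                              # lag correlator, per baseline
    rate_fx = n_antennas * math.log2(n_lags) + 2 * n_baselines  # FFTs plus cross multiplications
    return rate_xf / rate_fx

example = workload_ratio(10, 1024)      # about 240, as quoted in the text
many_antennas = workload_ratio(10_000, 1024)  # approaches N/2 = 512
```

For ten antennas and 1024 lags the ratio is about 240, and for a very large number of antennas it approaches N/2.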


Digital Fringe Rotation. In early systems, fringe rotation was often applied to the 
signal as an analog process, but generally it is advantageous to implement it after 
digitization. For example, in VLBI observations in which the data are recorded as 
digital samples, it is useful to be able to repeat the analysis with different fringe 
rates if the position of the source on the sky is not known with sufficient accuracy 
before the observation. Digital fringe rotation is usually applied to the digitized 
IF waveform before it goes to the correlator and involves multiplication with a 
digitized fringe rotation waveform. It is desirable to use a multibit representation 
for the rotated data to maintain the required accuracy, and thus, the number of bits 
in the input data to the correlator may be increased. Increasing the number of bits per 
sample in a lag correlator results in a proportional increase in complexity. Thus, it 
may be necessary to truncate the data before input to the correlator, which effectively 
introduces the quantization loss a second time. In contrast, in the FX design, multibit 
data representation is required in the FFT processing, so the bit increase that fringe 
rotation presents is more easily accommodated. See Sect. 9.7.1 for more details. 


Fractional Sample Delay Correction. In digital implementation of the compensating 
delays, one way of adjusting the delay in steps smaller than the sampling interval 
is to adjust the timing of the sampler pulses, as described in Sect. 8.6. Another way 
of introducing a fractional sample period delay is applied after transformation to the 
frequency domain, by incrementing the phase values by an amount that varies in 
proportion to the frequency across the IF band. In the FX correlator, this is easily 
done because the signals appear as an amplitude spectrum every FFT cycle, and the 
correction can be applied as required for each antenna before the data are combined 
in antenna pairs. With a lag correlator, there are two problems in this process. First, 
the transformation to a spectrum occurs after the data are combined for antenna 
pairs, so many more values require correction. Second, for long baselines, the 
corrections required may occur more rapidly than the rate at which cross-correlation 
values are transformed to cross power spectra. Thus, it may be possible to apply only 
a statistical correction rather than an exact one. See Sect. 9.7.3 for a description of 
the statistical corrections. 
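
The frequency-domain form of the correction can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any specific correlator: the test signal is a tone with an integer number of cycles per segment (so that the circular shift implied by the DFT is exact), and the helper name `fractional_delay` is hypothetical. In an FX correlator the phase slope would be applied to the F-engine output directly; the inverse transform here is used only to check the result.

```python
import numpy as np

def fractional_delay(segment, delay_samples):
    """Delay a real segment by a fraction of a sample by applying a phase
    slope proportional to frequency across its spectrum."""
    n = len(segment)
    freqs = np.fft.rfftfreq(n)                      # cycles per sample, 0 ... 0.5
    spectrum = np.fft.rfft(segment)
    spectrum = spectrum * np.exp(-2j * np.pi * freqs * delay_samples)
    return np.fft.irfft(spectrum, n)

n_samples = 256
t = np.arange(n_samples)
cycles = 5                                          # integer number of cycles per segment
signal = np.cos(2 * np.pi * cycles * t / n_samples)
delayed = fractional_delay(signal, 0.3)             # delay by 0.3 of a sample interval
expected = np.cos(2 * np.pi * cycles * (t - 0.3) / n_samples)
```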


Quantization Correction. The nonlinearity of the amplitude of the cross-correlation 
measured using coarsely quantized samples is seen in the Van Vleck relationship 
[Eq. (8.25)]. Application of a correction for the nonlinearity in quantization in 
the lag (XF) correlator is a relatively straightforward process because the cross-correlation 
values are directly calculated. To obtain the cross-correlation values 
in the FX correlator, the cross power spectrum at the correlator output must be 
Fourier transformed from the frequency domain to the lag domain. After applying 
the correction, the data must then be transformed back to a frequency spectrum. 
The correction is necessary only if the correlation of the total waveform (signal 
plus noise) is large for any pair of antennas. This condition implies observation 
of a source that is largely unresolved and sufficiently strong that the signal power 
in the receiver is comparable to the noise or greater. In the case of a spectral line 
observation, it is the power averaged over the receiver bandwidth that is important. 
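
The round trip described above (spectrum to lags, correction, lags back to spectrum) can be sketched as follows. The sketch makes several assumptions that are not taken from the text: two-level quantization, for which the Van Vleck relationship is ρ = sin(πρ₂/2); a cross power spectrum already normalized so that its inverse transform gives correlation coefficients; and a synthetic lag function. It also omits the zero padding needed to avoid wrap-around in the lag domain. The function names are hypothetical.

```python
import numpy as np

def van_vleck_two_level(rho_measured):
    """Van Vleck relationship for two-level samples: true correlation
    from the measured (quantized) correlation coefficient."""
    return np.sin(0.5 * np.pi * rho_measured)

def correct_fx_spectrum(cross_spectrum):
    """Transform a normalized cross power spectrum to the lag domain,
    apply the quantization correction, and transform back."""
    lags = np.fft.ifft(cross_spectrum).real   # correlation coefficient versus lag
    return np.fft.fft(van_vleck_two_level(lags))

# Synthetic check: a strongly correlated lag function with peak 0.8
k = np.arange(64)
rho_true = 0.8 * np.exp(-0.5 * ((k - 5) / 3.0) ** 2)
rho_two_level = (2 / np.pi) * np.arcsin(rho_true)    # what two-level sampling would measure
spectrum_measured = np.fft.fft(rho_two_level)
spectrum_corrected = correct_fx_spectrum(spectrum_measured)
```

For small correlation the correction reduces to a constant scale factor of π/2, which is why the full lag-domain treatment matters only for strong, largely unresolved sources.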


Adaptability. The FX design is somewhat more easily expanded or adapted to 
special requirements because more of the system is modularized per antenna rather 
than per baseline, as in the lag correlator. Addition of an extra antenna to an FX 
correlator requires less modification of the reduction procedure than is necessary 
for a lag correlator. Thus, the FX design is convenient for projects in which the 
number of antennas is planned to increase over time and is more efficient for larger 
arrays (Parsons et al. 2008). 


Pulsar Observations. For pulsar observations, a gating system at the correlator 
output is required to separate data received during the pulsar-on period, so that 
the sensitivity is not degraded by noise received when the pulsar is off. For many 
pulsars, which have periods > 0.1 s, time resolution of order 1 ms is adequate 
in the gating.⁷ With an FX correlator, it is necessary to collect data in complete 
sequences of N samples, so the gating process has to accommodate data that arrive 
at time intervals of ~N times the sample interval $\tau_s$. For example, with N = 1024 
and a total bandwidth of 10 MHz, $N\tau_s \sim 500\ \mu$s. Again, this might restrict 
flexibility for the fastest pulsars. However, a nice feature of the FX correlator is that 
complete spectra are obtained during each $N\tau_s$ interval in time. In the subsequent 
time averaging, it is possible to process the frequency channels individually and to 
vary the time of the gating pulse for each one so as to match the variation in pulse 
timing that results from dispersion in the interstellar medium. 


⁷ Many arrays can also be used in a phased-array mode (e.g., for VLBI, see Sect. 9.9), which 
provides one signal output per polarization. A specially designed pulsar processor can then provide 
measurements with high time resolution for study of the pulse profile and timing. In such cases, 
the array is used only to provide a large collecting area for high sensitivity. See Sect. 9.9. 


Choice of Correlator Design. Because the relative advantages of the lag and FX 
schemes discussed above involve a number of different features, the best choice of 
architecture for any particular application may not be immediately obvious. Detailed 
design studies for different approaches, taking account of the precise requirements 
and the implementation of the very-large-scale integrated (VLSI) circuits, are 
required. For discussions of lag and FX correlators, see D’Addario (1989), Romney 
(1995, 1999), and Bunton (2003). The widespread use of polyphase filter banks for 
precise channel definitions and radio frequency interference (RFI) excision favors 
the FX approach (see Sect. 8.8.9). 


8.8.6 Hybrid Correlator 


In designing a broadband correlator, it may be advantageous or necessary to divide 
the analog signal of total bandwidth, $\Delta\nu$, from each antenna into $n_f$ contiguous 
narrow sub-bands. A separate digital sampler is used for each such sub-band, and 
the correlator is designed as $n_f$ sections operating in parallel to cover the full 
signal band. A system of this type that incorporates both analog filtering and digital 
frequency analysis is referred to as a hybrid correlator. If the digital part uses a lag 
design, then the rate of digital operations is reduced by a factor $n_f$ relative to the rate 
for a lag correlator that processes the whole bandwidth without subdivision. This can 
be seen from Eq. (8.116), where for one sub-band, the bandwidth is $\Delta\nu_f = \Delta\nu/n_f$, 
the number of channels required is $N/n_f$, but $n_f$ such sections of digital processing 
are required. We can write a cost equation for a hybrid correlator (Weinreb 1984) 
as 

$$ C = A_1\,\frac{\Delta\nu\, n_a(n_a - 1)N}{n_f} + A_2\, n_f n_a + A_3 , \tag{8.121} $$

where $A_1$ and $A_2$ are coefficients for the digital and analog hardware, respectively, 
and $A_3$ is another constant. 
In this equation, the cost can be minimized with respect to $n_f$, with the result that 

$$ n_f = \left[\frac{A_1}{A_2}\,\Delta\nu\,(n_a - 1)N\right]^{1/2} . \tag{8.122} $$
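
The minimization can be checked numerically. In the sketch below the cost coefficients, bandwidth, and array size are hypothetical values chosen only for illustration, and the function names are introduced here; the code compares the closed-form optimum of Eq. (8.122) with a brute-force search over integer numbers of sub-bands.

```python
import math

def hybrid_cost(n_sub, a1, a2, a3, bandwidth, n_antennas, n_channels):
    """Cost model of Eq. (8.121): a digital term that falls as 1/n_sub
    plus an analog term that grows in proportion to n_sub."""
    digital = a1 * bandwidth * n_antennas * (n_antennas - 1) * n_channels / n_sub
    analog = a2 * n_sub * n_antennas
    return digital + analog + a3

def optimal_subbands(a1, a2, bandwidth, n_antennas, n_channels):
    """Closed-form minimum of the cost with respect to the number of sub-bands."""
    return math.sqrt(a1 / a2 * bandwidth * (n_antennas - 1) * n_channels)

# Hypothetical coefficients and system parameters (arbitrary cost units)
params = dict(a1=1e-3, a2=50.0, bandwidth=2000.0, n_antennas=8, n_channels=1024)
n_opt = optimal_subbands(params["a1"], params["a2"], params["bandwidth"],
                         params["n_antennas"], params["n_channels"])
best_integer = min(range(1, 201), key=lambda n: hybrid_cost(n, a3=0.0, **params))
```

Because the cost curve is shallow near its minimum, rounding the optimum to a convenient integer number of sub-bands changes the cost very little.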

Table 8.6 Hybrid channelization 

Instrument        Year commissioned   $\Delta\nu$ (GHz)   $\Delta\nu_f$ (MHz)   $n_f$   Reference 
SMA               2003                2.0^b      104          24^a    Ho et al. (2004) 
SMA               2015                8.0        2000         4       — 
Plateau de Bure   1992                0.5        50           10      Guilloteau et al. (1992) 
Plateau de Bure   2013                8.0        2000         4       Jan (2008) 
ALMA              2013                4.0^b      2000^c       2       Escoffier et al. (2007) 

^a $n_f$ is only approximately $\Delta\nu/\Delta\nu_f$ because of sub-band overlap. 
^b Two polarizations or two bands. 
^c These sub-bands are subsequently reduced to 128 channels of 62.5 MHz each by digital filtering. 
Each 2000-MHz band can be positioned independently. 


Equation (8.122) is useful only if the digital electronics are fast enough to handle a 
bandwidth of $\Delta\nu/n_f$. Over the last decades, the sampling rates have steadily risen 
and the costs have dropped for digital hardware, while the cost of analog electronics 
has remained relatively flat. The evolution of design in hybrid correlators can be 
seen in Table 8.6. A general disadvantage of the hybrid correlator is that very 
careful calibration of the frequency responses of the sub-bands is required to avoid 
discontinuities in gain at the sub-band edges. In general, it is advantageous to use the 
fastest samplers to minimize the analog filtering required. However, at millimeter 
wavelengths, where very wide bandwidths are needed and can be accommodated by 
receivers, the restriction on digital sampling speed requires some channelization. If 
an FX implementation is used for the digital section, a similar cost equation can be 
written, but there is less reduction in the number of operations since in Eq. (8.117), 
N enters logarithmically. 


8.8.7 Demultiplexing in Broadband Correlators 


The bit rate for the VLSI circuits used in large correlator systems is generally slower 
than that of the digital samplers that are used with broadband correlators. Serial-to- 
parallel conversion at the sampler output, that is, demultiplexing in the time domain, 
allows use of optimum bit rates for the correlator. Consider a system in which 
each sampler output is demultiplexed into n streams, and assume for simplicity 
that there is one bit per sample; parallel architecture accommodates multiple bits. 
Any n contiguous samples all go to different streams. To obtain all the products 
required in a lag correlator for a pair of IF signals with this configuration of the 
data, it would be necessary to include cross-correlations between each stream of 
one signal with every stream of the other signal. To simplify the system, Escoffier 
(1997) developed a scheme in which the n demultiplexed bit streams from each 
signal are fed into a large random access memory (RAM) and read out in reordered 
form. Each demultiplexed stream then contains a series of discontinuous blocks of 
$\sim 10^5$ samples. Each block contains data contiguous in time, as sampled. Cross-correlations 
are performed between data in corresponding blocks only. Thus, for 
any pair of input signals, n cross-correlators running at the demultiplexed rate are 
required for each value of lag. Also, each signal requires two RAM units so that 
one is filled as the other is read out. In Escoffier’s system, the sample rate is 4 Gbit 
s$^{-1}$, n = 32, and the length of a block of the demultiplexed data is approximately 
1 ms. Since cross-correlations do not extend across the boundaries of any given 
block, there is a very small loss of efficiency, which in this case is about 0.2%. 
Another possible approach is based on demultiplexing in the frequency domain, 
as in the case of the hybrid correlator. It is then necessary only to cross-correlate 
corresponding frequency channels between each antenna, so the number of cross- 
correlators per signal pair is again equal to n for each lag. Carlson and Dewdney 
(2000) have described an all-digital development of the frequency demultiplexing 
principle used in the hybrid correlator. This is used with the expanded VLA (Perley 
et al. 2009), and the system is described as a WIDAR correlator. Broadband signals 
are digitized at full bandwidth, divided into frequency channels using digital filters, 
and resampled at the appropriate lower rate before cross-correlation between all 
antenna pairs. (The use of digital filters avoids the small differences in the responses 
of analog filters, which in some systems provide the initial channelization.) As a 
final step, the cross-correlated data are Fourier transformed to the frequency domain. 
This scheme is sometimes referred to as an FXF system. Both Escoffier’s reordering 
scheme and the WIDAR system of demultiplexing provide approaches to the design 
of large broadband correlators. The latter requires fewer lags because the digital 
filters provide part of the spectral resolution. 

For filtering sampled signals, digital filters of the FIR (finite impulse response) 
type can be used, in which the incoming sample stream is convolved with a series of 
numbers, referred to as tap weights, the Fourier transform of which represents the 
filter response (Escoffier et al. 2000). The tap weights can be stored in a RAM and 
readily changed as required. An important advantage of digital filters is the freedom 
from individual variations of the characteristics. However, it may be necessary to 
truncate the output data samples to match the number of bits per sample that can be 
handled by the correlator, and thus a further quantization loss may be incurred. 
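
The relationship between tap weights and filter response can be illustrated with a short sketch. It assumes a windowed-sinc lowpass design with Hamming weighting, which is one common choice and not necessarily the design used in the systems cited above; the function names, the number of taps, and the test frequencies are all illustrative.

```python
import numpy as np

def design_lowpass_taps(n_taps, cutoff):
    """Windowed-sinc tap weights; cutoff is in cycles per sample (0 < cutoff < 0.5)."""
    n = np.arange(n_taps) - (n_taps - 1) / 2
    taps = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(n_taps)
    return taps / taps.sum()                  # unit gain at zero frequency

def fir_filter(samples, taps):
    """Convolve the incoming sample stream with the tap weights."""
    return np.convolve(samples, taps, mode="valid")

taps = design_lowpass_taps(63, cutoff=0.1)
response = np.abs(np.fft.rfft(taps, 1024))    # filter response = Fourier transform of the taps

t = np.arange(2000)
in_band = np.cos(2 * np.pi * 0.02 * t)        # tone inside the passband
out_of_band = np.cos(2 * np.pi * 0.30 * t)    # tone well inside the stopband
passed = fir_filter(in_band, taps)
rejected = fir_filter(out_of_band, taps)
```

Changing the filter amounts to loading a different set of tap weights, which is the flexibility referred to above.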


8.8.8 Examples of Bandwidths and Bit Data Quantization 


The initial observing bandwidth of the 27-antenna VLA, when it came into 
operation in the early 1980s, was 100 MHz per polarization with three-level (2-bit) 
sampling. The expanded system that came into operation around 2010, covering 
a frequency range of 1-50 GHz, has a maximum observing bandwidth of 8 GHz 
per polarization with 3-bit sampling, or 8-bit sampling with a reduced bandwidth 
(Perley et al. 2009). This large increase in data capacity is possible as a result of the 
increase in computing speed and in signal transmission capacity using optical fiber. 
The Atacama Large Millimeter/submillimeter Array, which came into operation in 
2012, covers bandwidths of 8 GHz per polarization with 3-bit (8-level) quantization 
(Wootten and Thompson 2009). The number of antennas is 64 and the correlator is 
FX with initial digital filtering, sometimes referred to as an FXF system. 

In the meter-wavelength range, observing bandwidths are generally narrower 
than at shorter wavelengths, but the spectrum is often more heavily used by 
transmitting services, so the requirement for avoiding or removing interfering 
signals is important. Larger numbers of bits allow for greater dynamic range in the 
system response, which helps to reduce the probability that interfering signals will 
cause overloading. The LWA (Long Wavelength Array) covers 20-80 MHz using 8- 
bit (256-level) sampling with the option of 12-bit (4096-level) sampling. The sample 
frequency is 196 mega-samples/sec (Ellingson et al. 2009). The LOFAR system 
covers 15-80 MHz and 110-240 MHz using 12-bit digitization (de Vos et al. 2009). 


8.8.9 Polyphase Filter Banks 


Polyphase filtering is a digital signal-processing technique that was developed 
for applications such as the separation of signals in multichannel communication 
systems with high interchannel rejection (Bellanger et al. 1976). The disadvantages 
of the nonoverlapping-segment discrete Fourier transform (DFT) processing, which 
we will call the single-block Fourier transform (SBFT) method, have been noted 
in earlier sections of this chapter and are also described in Appendix 8.4. Namely, 
this approach has high spectral leakage since the spectral response is a sinc-squared 
function that has sidelobe levels as high as −13.5 dB. In addition, the amplitude 
of a monochromatic signal, or unresolved cosmic line, depends on its relative 
location with respect to the channel boundaries, going from 1 at channel center to 
$(2/\pi)^2 = 0.41$ at the edge. This effect is called scalloping. There is a slight loss in 
sensitivity for signals whose line widths are close to the spectral resolution, which 
is related to the effective lag distribution of the DFT (see Sect. 8.8.5). 

Polyphase filtering and polyphase filter banks (PFBs) correct these deficiencies 
at a modest computational overhead. PFBs have become an important tool in radio 
astronomy as a way of excising radio frequency interference since they make 
possible the elimination of only the specific channels in which the interference 
occurs. It is also helpful in spectroscopic observations of some cosmic sources 
such as masers, where a very strong and narrow line in the passband makes it 
difficult to study other nearby lines because of the effect of spectral leakage. 
For detailed treatments of PFBs, see Crochiere and Rabiner (1981), Vaidyanathan 
(1990), and Harris et al. (2003). Useful tutorials are available by Harris (1999) 
and Chennamangalam (2014). For applications to radio interferometry, see Bunton 
(2000, 2003). 

Before describing the PFB, consider an elementary design of a digital filter bank 
based on a conventional analog filter bank with M equally spaced filters spanning the 
frequency range 0 to $\Delta\nu$. Suppose the input voltage is x(t), which is a bandlimited 
Gaussian process in the frequency range 0 to $\Delta\nu$. x(t) can be represented by a digital 
sequence x(n), sampled at the Nyquist interval $1/(2\Delta\nu)$. A crude lowpass filter can 
be constructed by taking a running mean of M samples in the time domain. The 
spectral response to this “boxcar” averaging is a sinc function with its first null at 
$2\Delta\nu/M$. Obtaining a perfect lowpass filter response in the frequency domain with a 
cutoff at $\nu_c = \Delta\nu/M$ would require the convolution in the time domain with a sinc 
function [$\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$], 


$$ h(t) = \mathrm{sinc}(2\nu_c t) = \mathrm{sinc}(n/M) . \tag{8.123} $$


Note that for M = 1, h(t) = 1 for n = 0 and otherwise is zero, so x(n) 
remains unchanged. However, for M > 1, perfect lowpass filtering action requires a 
convolution over infinite time. As an approximation, we can use N-point smoothing. 
The filter shape will be 

$$ H(\nu) = \sum_{n=0}^{N-1} \mathrm{sinc}(n/M)\, e^{j2\pi\nu n/(2\Delta\nu)} , \tag{8.124} $$

which has a fairly sharp cutoff at $\nu = \Delta\nu/M$. y(n), the smoothed version of x(n), 
will be oversampled by a factor of about M. The normal process at this point is 
to resample y(n), taking every Mth sample. This process is called decimation,⁸ or 
downsampling. To make the rest of the filter bank, multiply x(n) by $e^{j2\pi\nu n/(2\Delta\nu)}$, where 
$\nu = m\Delta\nu/M$ and m = 1 to M − 1, filter each stream by h(n), and downsample. 
This process is inefficient since the downsampling discards most of the arithmetic 
computations. The PFB provides a more efficient processing structure to obtain a 
filter shape with a sharp cutoff. 

We now describe the PFB, following the analysis of Bunton (2000, 2003). 
Consider a sample sequence of data, x(n), of length N, which is multiplied by a 
window function h(n). Its DFT is 


$$ X(k) = \sum_{n=0}^{N-1} h(n)\,x(n)\, e^{-j(2\pi/N)nk} , \tag{8.125} $$


where k ranges from 0 to N − 1. The frequency steps are $2\Delta\nu/N$, i.e., covering 
both positive and negative frequencies. If H(k), the DFT of h(n), has a width of 
approximately $2\Delta\nu/M$, then X(k) will be oversampled, and only every rth sample, 
where r = N/M, need be retained. N and M are chosen so that r is an integer. If H(k) is desired to be 
the discrete idealization of a perfect lowpass filter, i.e., 

$$ H(k) = \begin{cases} 1 , & -N/(2M) < k < N/(2M) , \\ 0 , & \text{otherwise} , \end{cases} \tag{8.126} $$

then 

$$ h(n) \simeq \mathrm{sinc}\left[\frac{(n - N/2)\,r}{N}\right] . \tag{8.127} $$


⁸ “Decimation” formally means a reduction to 1/10; however, the broader definition is to “reduce 
drastically, especially in number.” 


The decimated spectrum, i.e., taking every rth point of X(k) in Eq. (8.125), is 
then 

$$ X(k') = \sum_{n=0}^{N-1} h(n)\,x(n)\, e^{-j(2\pi/N)nrk'} , \tag{8.128} $$


where $k'$ goes from 0 to M − 1. We can rewrite Eq. (8.128) as a double summation 
over r subsegments, each of length M, as 

$$ X(k') = \sum_{m=0}^{r-1}\sum_{n=0}^{M-1} h(n+mM)\,x(n+mM)\, e^{-j(2\pi/N)(n+mM)rk'} . \tag{8.129} $$

Notice that 

$$ e^{-j(2\pi/N)(n+mM)rk'} = e^{-j(2\pi nk'/M)}\, e^{-j2\pi mk'} . \tag{8.130} $$

The rightmost exponential factor is unity. Hence, 

$$ X(k') = \sum_{m=0}^{r-1}\sum_{n=0}^{M-1} h(n+mM)\,x(n+mM)\, e^{-j(2\pi/M)nk'} . \tag{8.131} $$

In Eq. (8.131), there are r DFTs of length M, and in Eq. (8.128), there is one DFT 
of length N = rM, so there is only a slight reduction in the workload, approximated 
by the number of multiplications required. Note that the FFT algorithm, which has 
a workload proportional to $M\log_2 M$, is used for the DFT calculation. 

The kernel of the exponential in Eq. (8.131) does not contain m, so we can 
interchange the order of summation and rewrite it as 

$$ X(k') = \sum_{n=0}^{M-1}\left[\sum_{m=0}^{r-1} h(n+mM)\,x(n+mM)\right] e^{-j(2\pi/M)nk'} . \tag{8.132} $$


This step reduces the calculation from r DFTs of length M to one DFT of length 
M. The workload for applying the window function h(n) remains proportional 
to N. Hence, the workload for Eq. (8.128) is $N + N\log_2 N$, while the workload 
for Eq. (8.132) is $N + M\log_2 M$. The workload is thus reduced by a factor of R, 
given by 

$$ R = \frac{N + N\log_2 N}{N + M\log_2 M} = \frac{1 + \log_2 N}{1 + \frac{1}{r}\,(\log_2 N - \log_2 r)} \simeq r , \tag{8.133} $$

where the approximation holds for $N \gg 1$. 
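
As a worked illustration of Eq. (8.133), take the values N = 1024 and r = 4 (so M = 256); these particular numbers are chosen here only as an example:

```latex
R = \frac{1 + \log_2 1024}{1 + \tfrac{1}{4}\,(\log_2 1024 - \log_2 4)}
  = \frac{1 + 10}{1 + \tfrac{1}{4}\,(10 - 2)}
  = \frac{11}{3} \approx 3.7
```

For a segment of this modest length the reduction factor is therefore somewhat below r = 4.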

After the calculation in Eq. (8.132) is performed, the N-point window is moved 
by M steps, and the process is repeated. Each segment of M points is thus processed 
r times. Therefore, the input and output data rates are the same, except when spectral 
values at negative frequencies are discarded. 

The calculation in Eq. (8.132) is expressed diagrammatically in Fig. 8.24. This 
process may seem counterintuitive in the following sense. The data stream is 
severely decimated by the action of the commutator, which distributes the time 
samples among the branches, or “partitions,” with a cycling period M. That is, the 
data samples into each of the M partitions are 


$$
\begin{array}{lllll}
x(0), & x(M), & x(2M), & \cdots, & x(rM-M) \\
x(1), & x(M+1), & x(2M+1), & \cdots, & x(rM-M+1) \\
x(2), & x(M+2), & x(2M+2), & \cdots, & x(rM-M+2) \\
\vdots & & & & \\
x(M-1), & x(2M-1), & x(3M-1), & \cdots, & x(rM-1) .
\end{array}
\tag{8.134}
$$

Each of these decimated data streams is undersampled by a factor of M, and its 
corresponding spectrum is heavily aliased. The action of the PFB undoes this 
aliasing. 

Fig. 8.24 A diagram of a polyphase filter bank, which converts a set of N data samples into an M- 
point spectrum. The input data stream, x(n), is distributed among the M filter partitions by a commutator. 
Each partition receives a data stream that has been downsampled by a factor M. In each partition, 
$P_n$ represents the action of the decimated version of h(n), as described by the term in brackets in 
Eq. (8.132). The nonaliased M-point spectrum is assembled by the action of the FFT. Note that if 
the data samples are real numbers, then only M/2 values of the spectrum, corresponding to the 
positive frequencies, need be retained. 

Consider an example where N = 1024, M = 256, and r = 4 (a four-tap 
polyphase filter), as shown in Fig. 8.25. The first polyphase partition, $P_0$, calculates 
only the four-term sum x(0)h(0) + x(256)h(256) + x(512)h(512) + x(768)h(768), 
and $P_1$ calculates x(1)h(1) + x(257)h(257) + x(513)h(513) + x(769)h(769). 
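
The equivalence between the decimated N-point transform of Eq. (8.128) and the folded M-point transform of Eq. (8.132) can be verified numerically. The following is a minimal sketch under stated assumptions: the window is taken to be a sinc centered on the segment, the input is synthetic white noise, and the function names are introduced here for illustration. NumPy's forward FFT uses the negative-exponent convention.

```python
import numpy as np

def pfb_direct(x, h, M):
    """Eq. (8.128): window N samples, take one N-point DFT, keep every r-th point."""
    r = len(x) // M
    return np.fft.fft(h * x)[::r]

def pfb_polyphase(x, h, M):
    """Eq. (8.132): coadd the r windowed subsegments, then take one M-point DFT."""
    r = len(x) // M
    folded = (h * x).reshape(r, M).sum(axis=0)   # partition n sums samples n, n+M, n+2M, ...
    return np.fft.fft(folded)

def pfb_stream(x, h, M):
    """Step the N-point window along the data stream by M samples per spectrum."""
    N = len(h)
    starts = range(0, len(x) - N + 1, M)
    return np.array([pfb_polyphase(x[s:s + N], h, M) for s in starts])

rng = np.random.default_rng(0)
N, M = 1024, 256                                 # r = 4 taps, as in the example above
n = np.arange(N)
h = np.sinc((n - N / 2) / M)                     # assumed window: sinc centered on the segment
x = rng.standard_normal(N)

direct = pfb_direct(x, h, M)
polyphase = pfb_polyphase(x, h, M)
spectra = pfb_stream(rng.standard_normal(4096), h, M)
```

The two routes give the same M-point spectrum to rounding error, while the polyphase route needs only one M-point FFT per output spectrum.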

We can now compare the performance and requirements of the SBFT and the 
PFB. The SBFT produces an M-point spectrum for each M data samples. It moves 
successively from block to block, so the data rate remains the same. The PFB takes 
in N data samples and produces an M-point spectrum and then steps by M samples 
for the next spectral calculation. Hence, its data rate also remains the same. The 
overhead in the PFB is due to the windowing. Hence, the workload ratio R needed 
for the PFB with respect to the SBFT is 


$$ R = \frac{N + M\log_2 M}{M\log_2 M} = 1 + \frac{r}{\log_2 M} . \tag{8.135} $$


For M = 1024 and r = 4, there is a 40% overhead incurred with the PFB structure. 
The flat response and low leakage from the PFB are made possible because there are 
N samples available to provide the filter action rather than M. Note that in hardware 
implementation, the buffering requirement increases with r. 
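
The overhead figure can be reproduced from the multiplication counts used above. This is a small sketch with hypothetical function names; it counts N = rM multiplications for the window and M log₂ M for the FFT, as in the text, and ignores buffering and word-length effects.

```python
import math

def sbft_workload(n_channels):
    """Multiplications per M-point spectrum for the single-block transform."""
    return n_channels * math.log2(n_channels)

def pfb_workload(n_channels, n_taps):
    """Windowing over N = r * M samples plus one M-point FFT."""
    return n_taps * n_channels + n_channels * math.log2(n_channels)

def pfb_overhead(n_channels, n_taps):
    """Fractional extra workload of the PFB relative to the SBFT."""
    return pfb_workload(n_channels, n_taps) / sbft_workload(n_channels) - 1.0

overhead = pfb_overhead(1024, 4)   # 0.4, i.e., the 40% quoted in the text
```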

It is advantageous to apply additional weighting to h(n) such as Hann, Hamming, 
or Blackman weighting to further reduce spectral leakage. This does not reduce the 
resolution significantly as long as the weighting function remains at a level of ~ 1 
over M samples. Examples of PFB and SBFT filter shapes are shown in Fig. 8.26. If 
the weighting is applied in the SBFT mode over M samples, the leakage is reduced, 
but the resolution is also reduced. 
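
The difference in spectral leakage can be estimated numerically. The sketch below is illustrative and rests on assumptions not taken from the text: a Hann-weighted sinc window centered on the segment with r = 8 and M = 64, a rectangular M-sample window for the SBFT, and leakage measured as the peak power response at offsets of at least one channel spacing from the channel center. The function name is hypothetical.

```python
import numpy as np

def max_leakage_db(window, M, min_offset=1.0, pad=16):
    """Peak power response, in dB relative to channel center, at frequency
    offsets of at least min_offset channel spacings."""
    n_fft = pad * len(window)
    power = np.abs(np.fft.rfft(window, n_fft)) ** 2
    power = power / power[0]
    offset = np.arange(power.size) * M / n_fft     # in units of the channel spacing
    return 10 * np.log10(power[offset >= min_offset].max())

M, r = 64, 8
N = r * M
n = np.arange(N)
sbft_window = np.ones(M)                                  # single-block transform
pfb_window = np.sinc((n - N / 2) / M) * np.hanning(N)     # Hann-weighted sinc, r = 8

sbft_db = max_leakage_db(sbft_window, M)
pfb_db = max_leakage_db(pfb_window, M)
```

Under these assumptions the SBFT shows the familiar first sidelobe at about −13 dB, while the weighted PFB response one channel away and beyond is lower by several tens of dB.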

Note that PFBs can be concatenated. The output of any subset or all of the 
channels of a PFB can be fed into an additional PFB to obtain finer resolution. 
The Murchison Widefield Array uses such a scheme. Another application is to use 
a PFB only for coarse channelization. Its output can then be fed to an XF or FX 
correlator. 


Fig. 8.25 A graphical representation of the action of a polyphase filter bank with r = 4, or four 
taps. A random noise data stream represented by a set of N independent Gaussian-distributed 
random numbers (i.e., white noise) is shown in the top panel. It is multiplied by a window function 
h(n), the envelope of which is shown in the next panel. Here, h(n) is chosen to be a sinc function 
with exactly four zero crossings, equal to the number of taps. The result is separated into four 
segments, which are coadded to form the M-term time series shown in the lowest panel, which 
is then Fourier transformed into an M-point spectrum, $-\Delta\nu$ to $\Delta\nu$, as formulated by Eq. (8.132). 
After this calculation, the window is moved by M samples, and the process repeated. Adapted from 
Gary (2014). 

Fig. 8.26 The thick line shows the response of a filter element in a PFB having r = 8 and Hann 
weighting applied. The thin line shows the response for a SBFT, a $\mathrm{sinc}^2$ function. Both filters have 
a response of about $(2/\pi)^2$ at the filter edge of 0.5 in normalized frequency units of $2\Delta\nu/M$. 
Axes: log power (dB) versus frequency (normalized units). 


8.8.10 Software Correlators 


Since, in practice, the signals for which cross-correlations are formed are in digital 
form, having also been subjected to a digital delay system, the cross multiplication 
and averaging processes can be carried out in a computer system. This is useful in 
small systems for which the development of special correlator hardware is avoided. 
Also, in the case of large systems in which antennas are brought into operation 
over a period of years, changes in the correlation requirements are more easily 
accommodated. An example of a software correlator and the advantages of the 
design are described by Deller et al. (2007). Most VLBI processing is done in 
software correlators. 


Appendix 8.1 Evaluation of $\sum_{q=1}^{\infty} R_\infty^2(q\tau_s)$


The periodic function f(t) can be expressed as a Fourier series as follows: 


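$$f(t) = \frac{a_0}{2} + \sum_{q=1}^{\infty}\left[a_q \cos\left(\frac{2\pi q t}{T}\right) + b_q \sin\left(\frac{2\pi q t}{T}\right)\right] , \tag{A8.1}$$

where $T$ is the period of $f(t)$, and

$$a_q = \frac{2}{T}\int_0^T f(t)\cos\left(\frac{2\pi q t}{T}\right) dt , \tag{A8.2a}$$

$$b_q = \frac{2}{T}\int_0^T f(t)\sin\left(\frac{2\pi q t}{T}\right) dt . \tag{A8.2b}$$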
Parseval's theorem for Eq. (A8.1) takes the form

$$\frac{2}{T}\int_0^T f^2(t)\, dt = \frac{a_0^2}{2} + \sum_{q=1}^{\infty}\left(a_q^2 + b_q^2\right) . \tag{A8.3}$$


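Now let f(t) be a series of rectangular functions of unit height and width, one
centered on t = 0 and the others centered on integral multiples of $\pm\beta$ (so that f(t)
has period $\beta$). Then, one obtains

$$a_0 = \frac{2}{\beta} , \qquad a_q = \frac{2}{\beta}\,\frac{\sin(\pi q/\beta)}{\pi q/\beta} , \qquad b_q = 0 , \qquad \frac{2}{\beta}\int_0^{\beta} f^2(t)\, dt = \frac{2}{\beta} . \tag{A8.4}$$

From Eqs. (A8.3) and (A8.4),

$$\sum_{q=1}^{\infty}\left[\frac{\sin(\pi q/\beta)}{\pi q/\beta}\right]^2 = \frac{\beta - 1}{2} , \tag{A8.5}$$

which, from Eq. (8.15), is the summation needed to evaluate Eq. (8.19).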
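The closed form in Eq. (A8.5) can be checked numerically. The sketch below simply
accumulates a long partial sum; the function name and the number of terms are
illustrative choices, and because the terms fall off only as $1/q^2$, the partial sum
approaches $(\beta - 1)/2$ slowly.

```python
import math


def sinc_squared_sum(beta, terms=200_000):
    """Partial sum of [sin(pi q / beta) / (pi q / beta)]^2 for q = 1 .. terms."""
    total = 0.0
    for q in range(1, terms + 1):
        arg = math.pi * q / beta
        total += (math.sin(arg) / arg) ** 2
    return total


def closed_form(beta):
    """Right-hand side of Eq. (A8.5)."""
    return (beta - 1.0) / 2.0
```

Evaluating both functions for several values of $\beta \geq 1$ (for example 1, 2, and
3.5) gives agreement to within the truncation error of the partial sum.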


Appendix 8.2 Probability Integral for Two-Level 
Quantization 


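The probability integration required in Eq. (8.21) can be performed as follows. The
integral is

$$P_{11} = \frac{1}{2\pi\sigma^2\sqrt{1-\rho^2}} \int_0^{\infty}\!\!\int_0^{\infty} \exp\left[\frac{-(x^2 + y^2 - 2\rho x y)}{2\sigma^2(1-\rho^2)}\right] dx\, dy . \tag{A8.6}$$

Restore circular symmetry in the integral by the substitutions

$$z = \frac{y - \rho x}{\sqrt{1-\rho^2}} , \qquad dy = \sqrt{1-\rho^2}\, dz . \tag{A8.7}$$

Then

$$P_{11} = \frac{1}{2\pi\sigma^2} \int_0^{\infty} dx \int_{-\rho x/\sqrt{1-\rho^2}}^{\infty} \exp\left[\frac{-(x^2 + z^2)}{2\sigma^2}\right] dz . \tag{A8.8}$$

Next, substitute $x = r\cos\theta$ and $z = r\sin\theta$. The lower limit of the z integral in
Eq. (A8.8) represents the line $z = -\rho x/\sqrt{1-\rho^2}$, which makes an angle $\theta$ with
the x axis given by $\theta = -\sin^{-1}\rho$. The integral covers an area of the (x, z) plane
between this line and the z axis ($\theta = \pi/2$). Thus,

$$P_{11} = \frac{1}{2\pi\sigma^2} \int_0^{\infty} r\, dr \int_{-\sin^{-1}\rho}^{\pi/2} \exp\left(\frac{-r^2}{2\sigma^2}\right) d\theta . \tag{A8.9}$$

Finally, substitute $u = r^2/2\sigma^2$:

$$P_{11} = \frac{1}{2\pi} \int_0^{\infty} du \int_{-\sin^{-1}\rho}^{\pi/2} e^{-u}\, d\theta . \tag{A8.10}$$

Equation (A8.10) can be integrated directly to give

$$P_{11} = \frac{1}{4} + \frac{1}{2\pi}\sin^{-1}\rho . \tag{A8.11}$$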
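The result in Eq. (A8.11) can be cross-checked with a simple Monte Carlo experiment:
draw pairs of zero-mean, unit-variance Gaussian variables with correlation $\rho$ and
count how often both are positive. The sketch below is an illustration only; the
function names, sample count, and seed are arbitrary choices.

```python
import math
import random


def p11_analytic(rho):
    """Probability that both variables are positive, Eq. (A8.11)."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)


def p11_monte_carlo(rho, samples=200_000, seed=1):
    """Estimate the same probability from correlated Gaussian pairs."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + scale * rng.gauss(0.0, 1.0)  # y has correlation rho with x
        if x > 0.0 and y > 0.0:
            hits += 1
    return hits / samples
```

With a few hundred thousand samples, the statistical uncertainty of the estimate is
about 0.001, so the two functions should agree to two or three decimal places.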


Appendix 8.3 Optimal Performance for Four-Level 
Quantization 


Schwab (1986) has investigated various aspects of the performance of correlators 
with four-level quantization. These include precise values for optimal thresholds and 
quantization efficiencies, and expressions for computation of the cross-correlation 
as a function of the correlator output. The threshold values and the efficiencies are
given in Table A8.1.

The values of quantization efficiency $\eta_4$ for n = 3 and 4 are within 0.3% of the
highest value and are useful because nonintegral values of the weighting factor n
would require more complicated implementation in a lag-type correlator. Rational
approximations for the cross-correlation $\rho$ are minimax solutions; that is, they
minimize the maximum relative error. The variable $r_N$ is the normalized correlator
output, that is, the measured output divided by the corresponding output for $\rho = 1$.
The first three approximations given below are valid for all $|r_N| < 1$.

For n = 3 and the corresponding value of $v_0/\sigma$ in Table A8.1, the following
approximation yields a maximum relative error of $1.51 \times 10^{-4}$:


$$\rho(r_N) = \frac{1.1347043 - 3.0971312\, r_N^2 + 2.9163894\, r_N^4 - 0.89047693\, r_N^6}{1 - 2.6892104\, r_N^2 + 2.4736683\, r_N^4 - 0.72098190\, r_N^6}\; r_N . \tag{A8.12}$$


For $n \simeq 3.3359$ and the corresponding value of $v_0/\sigma$ in Table A8.1, the following
approximation yields a maximum relative error of $1.46 \times 10^{-4}$:

$$\rho(r_N) = \frac{1.1329552 - 3.1056902\, r_N^2 + 2.9296994\, r_N^4 - 0.90122460\, r_N^6}{1 - 2.7056559\, r_N^2 + 2.5012473\, r_N^4 - 0.73985978\, r_N^6}\; r_N . \tag{A8.13}$$
Table A8.1 Optimal thresholds and efficiencies for four-level quantization

n            $v_0/\sigma$    $\eta_4$
3            0.99568668      0.8811539496
3.3358750    0.98159883      0.8825181522
4            0.94232840      0.8795104597

For n = 4 and the corresponding value of $v_0/\sigma$ in Table A8.1, the following
approximation yields a maximum relative error of $1.50 \times 10^{-4}$:

$$\rho(r_N) = \frac{1.1368256 - 3.0533973\, r_N^2 + 2.8171512\, r_N^4 - 0.85148929\, r_N^6}{1 - 2.6529114\, r_N^2 + 2.4027335\, r_N^4 - 0.70073934\, r_N^6}\; r_N . \tag{A8.14}$$


The following approximation also applies for n = 4 and the corresponding value of
$v_0/\sigma$ in Table A8.1 but is valid for only $|r_N| < 0.95$. It yields a maximum relative
error of $2.77 \times 10^{-5}$:


$$\rho(r_N) = \frac{1.1369813 - 1.2487891\, r_N^2 + 4.5380174 \times 10^{-2}\, r_N^4 - 9.1448344 \times 10^{-3}\, r_N^6}{1 - 1.0617975\, r_N^2}\; r_N . \tag{A8.15}$$
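The rational approximations of Eqs. (A8.12)-(A8.14) are straightforward to evaluate in
software. The sketch below is illustrative rather than a reference implementation: it
assumes each approximation has the odd form $\rho = r_N\, P(r_N^2)/Q(r_N^2)$, with the
coefficients transcribed from the three equations, and the names `COEFFS` and
`rho_from_rn` are hypothetical.

```python
# Numerator and denominator coefficients in ascending powers of r_N^2,
# keyed by the weighting factor n (transcribed from Eqs. A8.12-A8.14).
COEFFS = {
    3.0: ((1.1347043, -3.0971312, 2.9163894, -0.89047693),
          (1.0, -2.6892104, 2.4736683, -0.72098190)),
    3.3358750: ((1.1329552, -3.1056902, 2.9296994, -0.90122460),
                (1.0, -2.7056559, 2.5012473, -0.73985978)),
    4.0: ((1.1368256, -3.0533973, 2.8171512, -0.85148929),
          (1.0, -2.6529114, 2.4027335, -0.70073934)),
}


def rho_from_rn(r_n, n=4.0):
    """Estimate the cross-correlation rho from the normalized correlator output r_N."""
    num, den = COEFFS[n]
    r2 = r_n * r_n
    p = sum(c * r2 ** i for i, c in enumerate(num))
    q = sum(c * r2 ** i for i, c in enumerate(den))
    return r_n * p / q
```

A useful sanity check on the transcription is that a normalized output of 1 should map
to a correlation of 1 to within the quoted maximum relative error, and that the
function is odd in $r_N$.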


Appendix 8.4 Introduction to the Discrete Fourier Transform 


This appendix provides a brief introduction to the discrete Fourier transform (DFT), 
with specific emphasis on applications important to topics covered in this book. For 
more comprehensive discussion, see Bracewell (2000) or Oppenheim and Schafer 
(2009). 

Consider the Fourier transform integral of a function x(t), a bandlimited signal
(0 to $\Delta\nu$), which has finite duration T:

$$X(\nu) = \int_0^T x(t)\, e^{-j2\pi\nu t}\, dt . \tag{A8.16}$$

We can approximate this integral as

$$X(\nu) \simeq \Delta \sum_{n=0}^{N-1} x(t_n)\, e^{-j2\pi\nu t_n} , \tag{A8.17}$$

where $x(t_n)$ is a sampled version of $x(t)$ at the Nyquist interval $\Delta = 1/2\Delta\nu$ so that
$t_n = n\Delta$. For simplicity, we assume x(t) to be a real function. We calculate $X(\nu)$ at
a set of N frequencies, $\nu_k = 2k\Delta\nu/N$, where k = 0 to N - 1, as

$$X(\nu_k) \simeq \Delta \sum_{n=0}^{N-1} x(t_n)\, e^{-j2\pi kn/N} . \tag{A8.18}$$

The important next step is to use Eq. (A8.18) as the basis for a definition of the DFT
by writing

$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-j2\pi kn/N} , \tag{A8.19}$$

where $x_n$ are the samples $x(t_n)$, and $X_k$ are the corresponding spectral components,
$X(\nu_k)$. For k = 0,

$$X_0 = \sum_{n=0}^{N-1} x_n , \tag{A8.20}$$

which corresponds to the component of X at $\nu = 0$. For k = N/2,

$$X_{N/2} = \sum_{n=0}^{N-1} x_n\, e^{-j\pi n} = \sum_{n=0}^{N-1} x_n (-1)^n , \tag{A8.21}$$

which corresponds to X at $\nu = \Delta\nu$. The negative frequency components lie between
k = N/2 and N - 1, and $X_N = X_0$. The inverse DFT is

$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k\, e^{j2\pi kn/N} . \tag{A8.22}$$


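We can show that Eq. (A8.22) is indeed the inverse discrete transform of Eq. (A8.19)
by substituting Eq. (A8.19) into Eq. (A8.22), that is,

$$x_n = \frac{1}{N} \sum_{k=0}^{N-1} \left[\sum_{\ell=0}^{N-1} x_\ell\, e^{-j2\pi k\ell/N}\right] e^{j2\pi kn/N} . \tag{A8.23}$$

We introduced time index $\ell$ to distinguish it from n. Interchanging the order of
summation gives

$$x_n = \frac{1}{N} \sum_{\ell=0}^{N-1} x_\ell \left(\sum_{k=0}^{N-1} e^{j2\pi k(n-\ell)/N}\right) . \tag{A8.24}$$
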
In the summation in parentheses in Eq. (A8.24), the phasor steps uniformly around
the complex plane exactly $n - \ell$ times. Hence,

$$\frac{1}{N} \sum_{k=0}^{N-1} e^{j2\pi k(n-\ell)/N} = \delta_{n\ell} , \tag{A8.25}$$

where $\delta_{n\ell}$ is called the Kronecker delta function, which has the properties

$$\delta_{n\ell} = 0 , \quad n \neq \ell ; \qquad \delta_{n\ell} = 1 , \quad n = \ell . \tag{A8.26}$$

The Kronecker delta function is nonzero only for $\ell = n$, so Eq. (A8.24) yields
$x_n = x_n$ and demonstrates that $x_n$ is recovered from the original data after a DFT
followed by an inverse DFT. Note that $x_n$ and $X_k$ are both periodic with period N.
Thus,

$$X_{k+mN} = X_k \tag{A8.27}$$

and

$$x_{n+mN} = x_n , \tag{A8.28}$$

where m is the period number. Thus, for example, $X_N = X_0 = \sum x_n$.
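The forward and inverse transforms of Eqs. (A8.19) and (A8.22) can be written directly
as double loops. The sketch below is for illustration only (an FFT would be used in
practice), and the function names are hypothetical. It places the 1/N factor in the
inverse transform, as in Eq. (A8.22).

```python
import cmath
import math


def dft(x):
    """Forward DFT, Eq. (A8.19): X_k = sum_n x_n exp(-j 2 pi k n / N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]


def idft(spectrum):
    """Inverse DFT, Eq. (A8.22): x_n = (1/N) sum_k X_k exp(+j 2 pi k n / N)."""
    n_pts = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_pts)
                for k in range(n_pts)) / n_pts
            for n in range(n_pts)]
```

Applying `idft(dft(x))` to any short sequence returns the original samples to within
rounding error, and the k = 0 and k = N/2 outputs reduce to the sums in Eqs. (A8.20)
and (A8.21).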

A very useful concept is to think of $x_n$ and $X_k$ as lying on a circle instead
of on a line. Most of the well-known theorems in Fourier transform theory have
counterparts in DFT theory where the data lie on a circle. The shift theorem of
Fourier transforms illustrates the circular nature of the DFT. The DFT of a sequence
$x_n$ circularly shifted by one step, i.e., $y_n = x_{n-1}$, or

$$y = \{x_{N-1}, x_0, x_1, x_2, \ldots, x_{N-2}\} , \tag{A8.29}$$

is

$$Y_k = \sum_{n=0}^{N-1} y_n\, e^{-j2\pi kn/N} \tag{A8.30}$$

$$= \sum_{n=1}^{N-1} x_{n-1}\, e^{-j2\pi kn/N} + x_{N-1} \tag{A8.31}$$

$$= \sum_{n=0}^{N-2} x_n\, e^{-j2\pi k(n+1)/N} + x_{N-1} \tag{A8.32}$$

$$= e^{-j2\pi k/N} \sum_{n=0}^{N-2} x_n\, e^{-j2\pi kn/N} + x_{N-1} \tag{A8.33}$$

$$= e^{-j2\pi k/N} \sum_{n=0}^{N-1} x_n\, e^{-j2\pi kn/N} . \tag{A8.34}$$

The last step of absorbing $x_{N-1}$ into the summation was accomplished by recognizing
that the $x_{N-1}$ term of the summation in Eq. (A8.34) is

$$e^{-j2\pi k/N}\, x_{N-1}\, e^{-j2\pi k(N-1)/N} = x_{N-1}\, e^{-j2\pi k} = x_{N-1} . \tag{A8.35}$$

Thus,

$$Y_k = e^{-j2\pi k/N} X_k . \tag{A8.36}$$

In general, for a shift of $\ell$ steps,

$$y_n = x_{n-\ell} , \tag{A8.37}$$

the DFT becomes

$$Y_k = e^{-j2\pi k\ell/N} X_k . \tag{A8.38}$$
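The circular shift theorem can be verified numerically with a short sketch. The helper
names below are hypothetical, and a direct DFT with the kernel
$e^{-j2\pi kn/N}$ is assumed.

```python
import cmath
import math


def dft(x):
    """Forward DFT with kernel exp(-j 2 pi k n / N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]


def circular_shift(x, ell):
    """Return y with y_n = x_{(n - ell) mod N}; samples wrap around the circle."""
    n_pts = len(x)
    return [x[(n - ell) % n_pts] for n in range(n_pts)]


def shifted_spectrum(x, ell):
    """Spectrum predicted by the shift theorem: exp(-j 2 pi k ell / N) X_k."""
    n_pts = len(x)
    return [cmath.exp(-2j * math.pi * k * ell / n_pts) * value
            for k, value in enumerate(dft(x))]
```

For any sequence `x` and integer `ell`, `dft(circular_shift(x, ell))` and
`shifted_spectrum(x, ell)` agree to rounding error, which is the content of the
general shift relation.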


Hence, the shift theorem for the DFT is clearly based on a circular shift of $x_n$. It is
straightforward to prove the cyclic convolution and correlation theorems:

$$x_n * y_n \;\overset{\mathrm{DFT}}{\longleftrightarrow}\; X_k Y_k , \tag{A8.39}$$

$$x_n \star y_n \;\overset{\mathrm{DFT}}{\longleftrightarrow}\; X_k Y_k^* . \tag{A8.40}$$

It is important to understand that to use expressions (A8.39) and (A8.40) to
calculate either a convolution or a correlation function, it is necessary to pad the spectral
array with N zeros to avoid unwanted products in the circular correlation (see
Sect. A8.4.2).
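The effect of the wraparound products can be seen in a small numerical experiment. The
sketch below is illustrative and makes two assumptions: the correlation is taken as
$c_n = \sum_m x_{m+n}\, y_m$, whose DFT is $X_k Y_k^*$ for real data, and the zero
padding is applied to the data sequences before transforming. All function names are
hypothetical.

```python
import cmath
import math


def dft(x, sign=-1):
    """DFT with kernel exp(sign * j 2 pi k n / N); sign=+1 gives the unscaled inverse."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / n_pts)
                for n in range(n_pts)) for k in range(n_pts)]


def circular_correlation(x, y):
    """Inverse DFT of X_k conj(Y_k): lags wrap around modulo N."""
    product = [a * b.conjugate() for a, b in zip(dft(x), dft(y))]
    return [value.real / len(x) for value in dft(product, sign=+1)]


def linear_correlation(x, y):
    """Pad both sequences with N zeros so that no wrapped products contribute."""
    pad = [0.0] * len(x)
    return circular_correlation(list(x) + pad, list(y) + pad)


def direct_correlation(x, y, lag):
    """Reference value: sum over m of x[m + lag] * y[m] without wraparound."""
    n_pts = len(x)
    return sum(x[m + lag] * y[m] for m in range(n_pts) if 0 <= m + lag < n_pts)
```

In the padded result, lag n >= 0 is found at index n and lag -n at index 2N - n.
Without padding, each output lag also contains products of samples that have wrapped
around the end of the array.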

Parseval's theorem for the DFT can be easily proved by using Eq. (A8.19) and
writing

$$\sum_{k=0}^{N-1} X_k X_k^* = \sum_{k=0}^{N-1} \left[\sum_{n=0}^{N-1} x_n\, e^{-j2\pi kn/N}\right] \left[\sum_{\ell=0}^{N-1} x_\ell\, e^{j2\pi k\ell/N}\right] . \tag{A8.41}$$

Interchanging the order of summations gives

$$\sum_{k=0}^{N-1} X_k X_k^* = \sum_{n=0}^{N-1} \sum_{\ell=0}^{N-1} x_n x_\ell \sum_{k=0}^{N-1} e^{-j2\pi k(n-\ell)/N} . \tag{A8.42}$$

The rightmost sum is proportional to the Kronecker delta function [Eq. (A8.25)], so

$$\sum_{k=0}^{N-1} X_k X_k^* = N \sum_{n=0}^{N-1} x_n^2 . \tag{A8.43}$$

If $x_n$ is complex, then the general form of Eq. (A8.43) becomes†

$$\sum_{n=0}^{N-1} |x_n|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X_k|^2 . \tag{A8.44}$$


A8.4.1 Response to a Complex Sine Wave 


We now calculate the DFT response to a complex sine wave,

$$x_n = e^{j2\pi\nu t_n} , \qquad t_n = \frac{n}{2\Delta\nu} . \tag{A8.45}$$

We introduce a normalized frequency $\nu' = N\nu/2\Delta\nu$. For the positive frequency
range of 0 to $\Delta\nu$, $\nu'$ ranges from 0 to N/2. Note that $\nu'$ does not have to be an
integer. The DFT of $x_n$ is

$$X_k = \sum_{n=0}^{N-1} e^{-j2\pi(k-\nu')n/N} . \tag{A8.46}$$

We use the formula for the sum of a geometric series,

$$\sum_{n=0}^{N-1} y^n = \frac{1 - y^N}{1 - y} , \tag{A8.47}$$


†In some DFT formulations, the N factor of Eq. (A8.22) appears in Eq. (A8.19). In that case, the N
factor in Eq. (A8.44) moves to the numerator on the left side.
where $y = e^{-j2\pi(k-\nu')/N}$, to write Eq. (A8.46) as

$$X_k = \frac{1 - e^{-j2\pi(k-\nu')}}{1 - e^{-j2\pi(k-\nu')/N}} . \tag{A8.48}$$

By factoring out $e^{-j\pi(k-\nu')}$ from the numerator and $e^{-j\pi(k-\nu')/N}$ from the denominator
of Eq. (A8.48), we can write

$$X_k = \left[\frac{e^{-j\pi(k-\nu')}}{e^{-j\pi(k-\nu')/N}}\right] \left[\frac{\sin \pi(k-\nu')}{\sin\left(\pi(k-\nu')/N\right)}\right] . \tag{A8.49}$$

We are interested in the power response

$$S_k = |X_k|^2 = \left[\frac{\sin \pi(k-\nu')}{\sin\left(\pi(k-\nu')/N\right)}\right]^2 . \tag{A8.50}$$

This is the circular form of the sinc function, which repeats on the interval N, that
is, $X_{k+N} = X_k$, or $S_{k+N} = S_k$.
We approximate the denominator of Eq. (A8.50) as $\pi(k-\nu')/N$, so that

$$S_k = |X_k|^2 \simeq N^2\, \mathrm{sinc}^2(k-\nu') . \tag{A8.51}$$

If $\nu' = m$, an integer, then

$$S_m = N^2 . \tag{A8.52}$$

In Fig. A8.1, we show the response to the complex sine wave for $\nu' = m$ and
$\nu' = m + 1/2$. Unless $\nu'$ corresponds exactly to a DFT channel, $S_k$ will be nonzero
in every channel, demonstrating the problem of spectral leakage.
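Spectral leakage is easy to reproduce numerically. The sketch below (hypothetical
function names, direct summation rather than an FFT) computes the power response to a
complex sine wave of normalized frequency $\nu'$ and compares it with the closed form
$\sin^2[\pi(k-\nu')]/\sin^2[\pi(k-\nu')/N]$.

```python
import cmath
import math


def tone_power_spectrum(nu_prime, n_pts):
    """S_k = |sum_n exp(-j 2 pi (k - nu') n / N)|^2 for k = 0 .. N-1."""
    powers = []
    for k in range(n_pts):
        total = sum(cmath.exp(-2j * math.pi * (k - nu_prime) * n / n_pts)
                    for n in range(n_pts))
        powers.append(abs(total) ** 2)
    return powers


def dirichlet_power(k, nu_prime, n_pts):
    """Closed-form power response; the limit N^2 is used where the denominator vanishes."""
    den = math.sin(math.pi * (k - nu_prime) / n_pts)
    if abs(den) < 1e-12:
        return float(n_pts ** 2)
    return (math.sin(math.pi * (k - nu_prime)) / den) ** 2
```

With N = 16 and $\nu' = 3$, all of the power falls in channel 3. With $\nu' = 3.5$,
every channel receives some power, and the computed values follow the closed form.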

The DFT of Eq. (A8.50) gives the corresponding response for the correlation 
function, which is a triangle function. This reflects the fact that the number of ways 
the correlation function, given by Eq. (A8.39), can be computed from a segment of 
data decreases linearly from N ways for lag zero to one way for lag N — 1, as shown 
in Fig. 8.23. If it is desired to calculate the correlation function from the power 
spectrum, i.e., Eq. (A8.39), it is important to note that the spectrum must be padded 
with zeros to length 2N. 
Fig. A8.1 (a) The response of the DFT of length N to a complex sine wave of frequency $\nu' = m$.
This is a plot of Eq. (A8.49) without the phase factor. The continuous envelope [e.g., as calculated
from Eq. (A8.54)] is shown along with the function values at the sample points. One repetition of
this periodic function is shown. (b) Response where $\nu' = m + 1/2$ (the frequency falls midway
between two DFT channels). (c) The data array size has been increased from N to 4N by padding
with zeros, which results in a more finely defined spectrum.


A8.4.2 Padding with Zeros 


Padding with zeros is a very important concept in DFT theory. Padding with zeros 
means adding a block of zeros to the data sequence, usually at the end, to increase 
its length from N to N’. The three main reasons to pad with zeros are: 


1. It provides a way to interpolate $X_k$ and define it at more finely spaced frequency
intervals.

2. If N is not a power of two, $x_n$ can be padded with zeros to N', a power of two.
This makes the FFT used to calculate the DFT more computationally efficient.
The N' spectrum is a properly interpolated spectrum in the Nyquist sense.

3. If a linear correlation function of two functions is to be properly calculated from
$X_k Y_k^*$ via the circular convolution theorem [Eq. (A8.39)], then $x_n$ and $y_n$ must be
padded with zeros to 2N to avoid unwanted multiplications.
To understand how interpolation is achieved, consider a data sequence of length
N to which M zeros are added, giving a sequence of length N' = N + M. The DFT
of the new data set is

$$X'_k = \sum_{n=0}^{N-1} x_n\, e^{-j2\pi kn/N'} + \sum_{n=N}^{N'-1} 0 \cdot e^{-j2\pi kn/N'} = \sum_{n=0}^{N-1} x_n\, e^{-j2\pi kn/Nr} , \qquad 0 \le k \le N' - 1 , \tag{A8.53}$$

where r = N'/N. The frequency spacing interval is now $2\Delta\nu/Nr$. Hence, if r = 2,
the spectrum $X'_k$ gives a halfway interpolation of the unpadded version of $X_k$.
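The interpolation property can be checked directly: with r = 2, every second sample of
the padded spectrum reproduces the unpadded spectrum, and the samples in between are
new, interpolated values. The sketch below uses a direct DFT and hypothetical helper
names.

```python
import cmath
import math


def dft(x):
    """Forward DFT with kernel exp(-j 2 pi k n / N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]


def zero_pad(x, n_prime):
    """Append zeros to the end of x to bring its length up to n_prime."""
    if n_prime < len(x):
        raise ValueError("n_prime must be at least len(x)")
    return list(x) + [0.0] * (n_prime - len(x))


def interpolated_spectrum(x, r):
    """Spectrum of x padded to r * N points, sampled r times more finely."""
    return dft(zero_pad(x, r * len(x)))
```

For example, `interpolated_spectrum(x, 2)[2 * k]` equals `dft(x)[k]` for every k, while
the odd-numbered outputs fall halfway between the original channels.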

Padding with zeros can provide arbitrarily fine definition of $X_k$. However, it is
often helpful to define a continuous spectrum associated with the discrete series
$x_n$ as

$$X(\nu') = \sum_{n=0}^{N-1} x_n\, e^{-j\pi\nu' n} , \tag{A8.54}$$

where $\nu' = \nu/\Delta\nu$. This is sometimes called the discrete time Fourier transform
(DTFT). It can be calculated for arbitrary values of $\nu'$.

It is often useful to load the FFT in such a way as to avoid unwanted phase factors
from appearing in the transform. For example, suppose we want to calculate the
spectrum from an autocorrelation function $R(\tau) = R_n$, where $n = 0, \ldots, N/2$, ranging
over positive delays only. Load $R_n$ into the array

$$R'_n = R_n , \qquad n = 0, \ldots, N/2 ,$$
$$R'_{N-n} = R_n , \qquad n = 1, \ldots, N/2 - 1 . \tag{A8.55}$$

The DFT of $R'_n$ will have a real-valued spectrum $S_k$ in the positive frequencies
k = 0 to k = N/2, with the negative frequencies being in k = N/2 + 1 to N - 1, as
shown in Fig. A8.2.
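The loading scheme can be illustrated with a short sketch. It assumes the mirrored
arrangement in which the positive lags $R_0 \ldots R_{N/2}$ occupy indices 0 to N/2 and
each lag $R_n$ for n = 1 to N/2 - 1 is copied to index N - n; the function names are
hypothetical.

```python
import cmath
import math


def dft(x):
    """Forward DFT with kernel exp(-j 2 pi k n / N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]


def load_autocorrelation(r_pos):
    """Build the length-N array from R_0 .. R_{N/2}, mirroring lags into the upper half."""
    half = len(r_pos) - 1          # N/2
    n_pts = 2 * half
    loaded = [0.0] * n_pts
    for n in range(half + 1):      # positive lags at indices 0 .. N/2
        loaded[n] = r_pos[n]
    for n in range(1, half):       # mirrored lags at indices N-1 .. N/2+1
        loaded[n_pts - n] = r_pos[n]
    return loaded
```

The DFT of the array returned by `load_autocorrelation` has a vanishing imaginary part,
whereas loading the same lags followed only by zeros produces a complex spectrum.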
Fig. A8.2 An example of the loading of an autocorrelation function into a DFT array. (top) A
continuously defined autocorrelation function (left) and its power spectrum (right) via Fourier
transform. (bottom) Positive lags loaded into 0 to N/2 indices, and negative lags loaded into
N/2 to N - 1 indices (left), and its spectrum (right) via DFT. Loading the data in this manner gives a
real power spectrum. Zero padding should be done in the middle of the delay values to keep the
spectrum real valued.
Chapter 9
Very-Long-Baseline Interferometry 


In 1967, a new technique of interferometry was developed in which the receiving 
elements were separated by such a large distance that it was expedient to operate 
them independently with no real-time communication link. This was accomplished 
by recording the data on magnetic tape for later cross-correlation at a central 
processing station. The technique was called very-long-baseline interferometry 
(VLBI), a term recalling the earlier long-baseline interferometers at Jodrell Bank 
Observatory, in which the elements were connected by microwave links that had 
reached 127 km in length. The principles involved in VLBI are fundamentally 
the same as those involved in interferometers with connected elements. The tape 
recorder and its successor, disk storage, can be considered as an IF delay line 
of limited capacity with an unusually long propagation time, weeks instead of 
microseconds. The use of tape and disk recording media is motivated entirely by 
economics and places substantial limitations on the system. Satellite links have been 
demonstrated (Yen et al. 1977), but their high cost discourages their use. 

Tape recorders have been entirely replaced by compact disks. Data can also be 
transmitted to correlation facilities via the Internet in quasi real time. However, 
latency and throughput are significant issues, and data buffering is usually required. 


9.1 Early Development 


The motivation to develop VLBI came from the realization that many radio sources 
have structures that cannot be resolved by interferometers with baselines of a 
few hundred kilometers. By the mid-1960s, it was well known that scintillation 
(discussed in Chap. 14) and time variability of the radiation from quasars implied 
angular sizes of < 0.01”. Maser emission from OH molecules at 18-cm wavelength 
was unresolved at 0.1”. Low-frequency burst radiation from Jupiter was believed to 
emanate from regions of small angular size. The aim of the first VLBI experiments
was to measure the angular sizes of these radio sources. It is instructive to consider
the operation of these early VLBI experiments in their most primitive form.
Consider two telescopes with system temperatures $T_{S1}$ and $T_{S2}$, which are pointed
at a compact source giving antenna temperatures $T_{A1}$ and $T_{A2}$. Each station records
N data samples within the coherence time, that is, the interval during which the 
independent oscillators remain sufficiently stable that fringes can be averaged. In the 
subsequent processing, these data streams are aligned, cross-correlated, and
time-averaged after removing the quasi-sinusoidal fringes. The expected correlation for
a point source is

$$\rho_0 = \eta \sqrt{\frac{T_{A1} T_{A2}}{(T_{S1} + T_{A1})(T_{S2} + T_{A2})}} , \tag{9.1}$$


where $\eta$ is a factor of value ~0.5 to account for losses due to quantization and
processing (see Sect. 9.7). Here, it is convenient to consider a normalized form of
the visibility:

$$V_N = \frac{\rho}{\rho_0} \simeq \frac{\rho}{\eta} \sqrt{\frac{T_{S1} T_{S2}}{T_{A1} T_{A2}}} , \tag{9.2}$$


where $\rho$ is the measured correlation, and we assume $T_A \ll T_S$. The rms noise level
is

$$\Delta\rho \simeq \frac{1}{\sqrt{N}} \simeq \frac{1}{\sqrt{2\Delta\nu\,\tau_c}} , \tag{9.3}$$

where $\Delta\nu$ is the IF bandwidth, and $\tau_c$ is the coherent integration time. Hence, from
Eqs. (9.1)-(9.3), the signal-to-noise ratio is

$$\frac{\rho}{\Delta\rho} \simeq V_N\, \eta \sqrt{\frac{T_{A1} T_{A2}}{T_{S1} T_{S2}}\, 2\Delta\nu\,\tau_c} . \tag{9.4}$$


If the minimum useful signal-to-noise ratio is 4, the smallest detectable flux density
is as follows, from Eqs. (1.3), (1.5), and (9.4):

$$S_{\min} \simeq \frac{8k}{V_N\, \eta} \sqrt{\frac{T_{S1} T_{S2}}{A_1 A_2}}\; \frac{1}{\sqrt{2\Delta\nu\,\tau_c}} , \tag{9.5}$$


where k is Boltzmann's constant, and $A_1$ and $A_2$ are the antenna collecting
areas. Typical parameters in 1967 were $A \simeq 250$ m$^2$ (25-m-diameter telescope),
$T_S \simeq 100$ K, $\eta \simeq 0.5$, and $N = 1.4 \times 10^8$ bits (one bit per sample), the capacity
of a tape at a density of 800 bpi (bits per inch) used in the NRAO Mark I system,
which was based on standard IBM compatible technology. For an unresolved source,
$S_{\min} \simeq 2$ Jy. The development after three decades is indicated by the following
parameter values: $A \simeq 1600$ m$^2$ (64-m-diameter telescope), $T_S \simeq 30$ K, and
$N = 5 \times 10^{12}$ bits, the capacity of an instrumentation tape operated at 64 MHz
bandwidth. For $V_N = 1$, Eq. (9.5) gives $S_{\min} \simeq 0.6$ mJy. In both examples, the
coherence time is assumed to be greater than the running time of the tape. The
source size can be estimated from a single measurement of $V_N$ by comparison with
the visibility expected for a symmetric Gaussian model. Hence, as in Fig. 1.5, the
full width at half-maximum, a, is given by

$$a = \frac{2\sqrt{\ln 2}}{\pi u} \sqrt{-\ln V_N} , \tag{9.6}$$

where u is the projected baseline (in wavelengths).
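The size estimate from a single normalized visibility can be sketched as follows. The
example assumes a circular Gaussian source whose visibility is
$V_N = \exp[-(\pi a u)^2/(4 \ln 2)]$, which is the relation inverted by the FWHM
formula above; the function names are hypothetical, and angles are in radians.

```python
import math


def gaussian_visibility(fwhm_rad, u_wavelengths):
    """Normalized visibility of a circular Gaussian source of the given FWHM."""
    return math.exp(-(math.pi * fwhm_rad * u_wavelengths) ** 2 / (4.0 * math.log(2.0)))


def gaussian_fwhm(v_n, u_wavelengths):
    """FWHM (radians) inferred from one normalized visibility, 0 < v_n <= 1."""
    if not 0.0 < v_n <= 1.0:
        raise ValueError("normalized visibility must lie in (0, 1]")
    return (2.0 * math.sqrt(math.log(2.0)) / (math.pi * u_wavelengths)
            * math.sqrt(-math.log(v_n)))
```

A round trip through the two functions recovers the input size, which is a convenient
check on the algebra. A measured $V_N$ close to 1 implies a source much smaller than
the fringe spacing 1/u.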

VLBI can be used only to study objects of exceedingly high intensity. Thus,
the emission processes must normally be of nonthermal origin. To be detected on
a baseline of length D, the source must be smaller than the fringe spacing. Since
the flux density S is $2kT_B\Omega/\lambda^2$, where $T_B$ is the brightness temperature, $\lambda$ is the
wavelength, and $\Omega$ is the source solid angle, the minimum detectable brightness
temperature is

$$(T_B)_{\min} \simeq \frac{2}{\pi k}\, D^2 S_{\min} , \tag{9.7}$$

since $\Omega \simeq \pi(\lambda/2D)^2$. If $D = 10^3$ km and $S_{\min} = 2$ mJy, then $(T_B)_{\min} \simeq 10^6$ K.
Therefore, observations of thermal phenomena occurring in molecular clouds,
compact HII regions, and most stars are generally not possible. On the other hand,
synchrotron sources such as supernova remnants, radio galaxies, and quasars, which
are limited to $10^{12}$ K by Compton losses; masers in which $T_B \simeq 10^{15}$ K; and pulsars
can be readily studied.
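The brightness temperature limit follows from combining $S = 2kT_B\Omega/\lambda^2$ with
$\Omega \simeq \pi(\lambda/2D)^2$, which gives $T_B = 2D^2 S/(\pi k)$ independent of wavelength.
The sketch below evaluates this expression; the function name and the unit handling
(baseline in meters, flux density in janskys) are illustrative choices.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
JANSKY = 1.0e-26             # W m^-2 Hz^-1


def min_brightness_temperature(baseline_m, s_min_jy):
    """Minimum detectable brightness temperature (K) for a source that just fills
    the fringe spacing of a baseline of length baseline_m."""
    return 2.0 * baseline_m ** 2 * s_min_jy * JANSKY / (math.pi * K_BOLTZMANN)
```

For a baseline of 1000 km and a limiting flux density of 2 mJy, the function returns
a value of order $10^6$ K. Doubling the baseline raises the limit by a factor of four.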

Three things were accomplished by early VLBI measurements: 


1. Simple intensity distributions were derived by comparing measured visibilities 
with source models. 

2. The distribution of the various spectral components of masers was mapped by 
comparing fringe frequencies for different spectral features. 

3. Source positions were measured to an accuracy of ~ 1” and baselines to an 
accuracy of a few meters. 


For a review of early techniques, see Klemperer (1972). Since then, the technique 
has moved steadily toward the mainstream of interferometry in terms of being able 
to produce reliable images of complex radio sources. The principal reason for this 
is the use of phase closure (see Sect. 10.3), which provides most of the phase 
information when a large enough number of antennas is available in the VLBI 
network. A list of various VLBI networks is shown in Table 9.1. 

It is interesting to note that the correlation of data in the earliest systems was
accomplished in software on general-purpose computers. After about 30 years,
during which correlation was done with custom-built hardware, this task has largely
reverted to general-purpose computers because of the rapid growth of their
capabilities (Deller et al. 2007, 2011).

Table 9.1 Examples of VLBI arrays^a

Name     Inception  Center     Stations  Antenna   Maximum        Frequency  Weeks/  Ref.
                                         size (m)  baseline (km)  (GHz)      year
EVN^b    1980       Europe     18        10-100    8000           0.3-86     12      1
VLBA^c   1993       USA        10        25        8610           1.3-86     52      2
LBA^d    1997       Australia   6        22-70     3100           1.3-22             3
CVN^e    2000       China       5        25-65     3250           1.4-22             4
VERA^f   2005       Japan       4        20        2273           6-43       52      5
KVN^g    2011       Korea       3        21        476            22-129     52      6
LOFAR^h  2012       Europe      8        ~50       1300           0.15-0.24  52      7
EHT^i    2012       USA         3        10-30     4700           230        1       8
IVS^j    1980       Global     32        10-100    10000          2 and 8    26      9

^a There are also networks of networks, such as the KaVA, which is a combination of VERA
(VLBI Exploration of Radio Astrometry) and KVN (Korean VLBI Network).
^b European VLBI Network.
^c Very Long Baseline Array.
^d Long Baseline Array, which often operates with Hartebeesthoek in South Africa and the
Warkworth Telescope in New Zealand.
^e Chinese VLBI Network. First antenna commissioned at Sheshan in 1987.
^f VLBI Exploration of Radio Astrometry.
^g Korean VLBI Network, dedicated to astrometry and geodesy. Dual-beam capability part of
larger Japanese VLBI Network (JVN) [see Doi et al. (2007)].
^h LOw Frequency ARray.
^i Event Horizon Telescope.
^j International VLBI Service.

References: 1, Porcas (2010); 2, Napier et al. (1994); 3, Edwards (2012); 4, Zhang et al. (2012);
5, Kobayashi et al. (2003), VERA (2015); 6, Lee et al. (2014); 7, van Haarlem et al. (2013); 8,
Doeleman (2010); 9, Behrend and Baver (2012).


9.2 Differences Between VLBI and Conventional 
Interferometry 


In this section, we briefly discuss the differences between VLBI and connected- 
element interferometry. Later sections in this chapter elaborate on these differences. 
Before beginning, we emphasize the theoretical unity of interferometry. The 
fundamental aim of all interferometry is to measure the coherence properties of 
the electromagnetic field. Thus, the principles of connected-element interferometry 
and VLBI are basically identical. However, certain special techniques used in VLBI 
are needed because of the particular observational constraints. As the continuity 
of (u,v) coverage is improved, from a few meters to more than $10^5$ km (with
the largest spacing achieved by elements on distant satellites), and fiberoptic or 
other advanced communication systems make recording unnecessary, the concept 
of VLBI as a distinct technique will become a matter of history. Here, we deal with 
certain limitations that make classical VLBI practices somewhat distinct from those 
of connected-element interferometry. 

Early VLBI experiments were conducted by organizing a diverse group of 
observatories that had been constructed for general radio astronomical research. 
Each telescope had its own limitations, calibration procedures, and management 
personnel. Various networks were formed to standardize procedures and automate 
the execution of VLBI experiments. Such ad hoc VLBI networks operated on an 
intermittent basis, and during observations, the communication between elements 
to verify proper operation was limited. Small amounts of data from strong sources 
could be transmitted from the antennas to the correlator over telephone lines 
and cross-correlated to determine the instrumental delays and to check that the 
equipment was working properly. Later, arrays dedicated to VLBI were brought 
into operation [see, e.g., Napier et al. (1994)]. 

In VLBI, one has less control over the system stability because independent 
frequency standards are used at each element. Frequency offsets in the standards 
can cause instrumental timing errors. These errors usually include an epoch error of 
a few microseconds and a drift of a few tenths of a microsecond per day (Sect. 9.5). 
Therefore, the correlation function of the received signals [with respect to time 
offset, $\tau$, as defined in Eq. (3.27)] must be measured to determine and track the
instrumental delay. In contrast, delay errors in connected-element interferometers, 
due mainly to baseline errors and atmospheric propagation delays, are usually less 
than 30 ps, corresponding to 1 cm of path length. These errors are negligible 
for bandwidths less than 1 GHz. Thus, the response in connected-element, delay- 
tracking interferometers is always centered on the white light fringe. Delay becomes 
important only when the field of view becomes too large for the bandwidth (see 
Sects. 2.2 and 6.3) or when spectral line measurements are made by introducing time 
offsets. In VLBI, it is necessary to search a range of delay values to find the correct 
time relationship that maximizes the correlation. Correlations for a number of delay 
offsets are usually formed simultaneously, so a VLBI correlator may resemble a 
digital spectral correlator, although the number of frequency channels may be less 
than generally used for spectral line observations. The frequency offsets in the stan- 
dards, which cause drifts with time in the instrumental delay, also introduce offsets 
in the fringe frequency. Thus, analysis of a VLBI experiment must begin with a two- 
dimensional search in delay and fringe frequency (delay rate) to find the peak of the 
correlation function. This process is referred to as fringe finding (see Sect. 9.3.4). 
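The two-dimensional search can be illustrated with a toy model. The sketch below is
not how production fringe fitters work (they use FFTs and interpolate the peak), and
every name and parameter in it is hypothetical. It assumes that the correlator
delivers complex visibilities as a function of time and frequency channel, that a
residual delay $\tau$ appears as a phase slope $2\pi\nu\tau$ across the channels, and
that a residual fringe rate f appears as a phase slope $2\pi f t$ in time.

```python
import cmath
import math
import random


def simulate_visibilities(delay_s, rate_hz, chan_freqs_hz, times_s, noise=0.5, seed=0):
    """Unit-amplitude fringes with a residual delay and rate, plus Gaussian noise."""
    rng = random.Random(seed)
    return [[cmath.exp(2j * math.pi * (nu * delay_s + rate_hz * t))
             + complex(rng.gauss(0.0, noise), rng.gauss(0.0, noise))
             for nu in chan_freqs_hz] for t in times_s]


def fringe_search(vis, chan_freqs_hz, times_s, delays_s, rates_hz):
    """Brute-force grid search; returns (peak amplitude, delay, rate)."""
    best = (0.0, None, None)
    for tau in delays_s:
        counter = [cmath.exp(-2j * math.pi * nu * tau) for nu in chan_freqs_hz]
        # Remove the trial delay and sum over channels for each time sample.
        per_time = [sum(v * c for v, c in zip(row, counter)) for row in vis]
        for rate in rates_hz:
            # Remove the trial fringe rate and sum coherently over time.
            total = sum(s * cmath.exp(-2j * math.pi * rate * t)
                        for s, t in zip(per_time, times_s))
            if abs(total) > best[0]:
                best = (abs(total), tau, rate)
    return best
```

The coherent sum peaks only when both trial values are close to the true residuals,
which is why the delay and rate must be searched jointly. The grid spacing has to be
fine compared with the inverse of the total bandwidth and of the integration time.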

The concept of coherence has different implications in VLBI and connected- 
element interferometry. In connected-element interferometry, there is generally a 
suitable calibration source within a few degrees of the source of interest that can 
be observed every few minutes. Even if the instrumental phase drifts, there is 
no fundamental limit on integration time, and the concept of coherence time is 
replaced by that of the interval between calibrations. In VLBI, the use of calibrators 
to extend coherence time is more difficult because the short-term phase stability
($\tau < 10^3$ s) is worse. Atmospheric fluctuations above the stations are generally
completely uncorrelated, and the frequency standards and frequency multipliers 
introduce phase noise in the fringes. Furthermore, a fundamental difference between 
connected-element interferometry and VLBI comes from the fact that there are 
many fewer sources that are unresolved at VLBI spacings and that can be used 
as calibrators. It is not always possible to find a calibrator close enough to the 
source under investigation to use as a phase reference. The time required to repoint 
the antennas and the decorrelation introduced by the atmosphere both increase 
with angular spacing. Thus, VLBI is subject to a fundamental coherence time that 
limits its sensitivity. For integration beyond the coherence time, it is necessary to 
average the fringe amplitudes, for which sensitivity improves only as the fourth 
root of the integration time (Sect. 9.3.5). It is also more difficult to calibrate 
phase in VLBI systems, although the situation has steadily improved as enhanced 
sensitivity has increased the number of sources that can be used as calibrators. 
Improved instrumental phase stability and more accurate modeling of the baselines, 
atmosphere, and similar factors have allowed the phase to be related to that of a 
calibrator several degrees away. Phase referencing in this manner is discussed in 
Sect. 12.2.3, and an example is shown in Fig. 12.1. Phase information can also be 
derived from phase closure analysis. In measuring positions, fringe frequency and 
group delay (the delay pattern effect discussed in Sects. 2.2 and 6.3.1) have also 
proved useful as measurement quantities. 

Storage of the undetected signals before correlation presents VLBI with several 
problems. The average IF bandwidth is limited by the recording medium, which 
therefore limits the sensitivity of VLBI. The data must be stored as efficiently as 
possible, which requires a coarsely quantized representation of the signal, sampled 
at the Nyquist rate. With such a representation, the basic operations of fringe rotation 
and delay tracking, when performed on the recorded data, introduce significant 
effects that must be allowed for in deriving the visibility (Sect. 9.7). 


9.2.1 The Problem of Field of View 


In most VLBI applications, the ratio of the extent of the source under study to the 
resolution is typically less than about 107 (see Figs. 1.19-1.21). It is interesting to 
consider the challenge of imaging the entire primary beam of the antennas used in a 
VLBI observation. Consider an array of the following parameters: 

D (longest baseline) = 4000 km
d (antenna diameter) = 25 m
N (number of array elements) = 10
$\nu$ = 10 GHz ($\lambda$ = 3 cm)
$\Delta\nu$ (bandwidth) = 1 GHz
$T_{\rm obs}$ (observation time) = 12 hr.
The nominal resolution is $\lambda/D$, or 1.5 mas, and the field of view is $\Delta\theta \simeq \lambda/d$, or
250". Hence, the number of pixels required for an image (at 2 pixels per resolution
element) of the entire primary beam is

$$N_p \simeq \pi \left(\frac{D}{d}\right)^2 \simeq 8 \times 10^{10} . \tag{9.8}$$

Note that $N_p$ is independent of wavelength because the resolution and field of view
both scale as wavelength.

The processing and data storage requirements are considerable because of the
large range of geometric delay and fringe rate that must be covered. The geometric
delay is $\tau_g = D\cos\theta/c$, where $\theta$ is the angle between the baseline vector and the
source direction. Thus, the range of delay over the primary beam is $D\sin\theta\,\Delta\theta/c$,
and the maximum delay range requirement is

$$\Delta\tau_{g,\max} = \frac{D}{\nu d} . \tag{9.9}$$

At the Nyquist sampling interval of $(2\Delta\nu)^{-1}$, the number of lags in the correlation
function needed to cover this range will be

$$N_\ell = 2 \left(\frac{D}{d}\right) \left(\frac{\Delta\nu}{\nu}\right) , \tag{9.10}$$

which is about 30,000 for our example.
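The figures quoted for this example array can be reproduced with a few lines of
arithmetic. The sketch below is illustrative: the function and key names are
hypothetical, the field of view is taken as $\lambda/d$, and the pixel count assumes a
circular field sampled at 2 pixels per resolution element, that is,
$(\pi/4)(2D/d)^2$.

```python
import math

C = 299_792_458.0                       # speed of light, m/s
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def wide_field_requirements(baseline_m, dish_m, freq_hz, bandwidth_hz):
    """Resolution, field of view, pixel count, and lag count for full-beam imaging."""
    wavelength = C / freq_hz
    resolution = wavelength / baseline_m             # radians
    field_of_view = wavelength / dish_m              # radians
    pixels = math.pi / 4.0 * (2.0 * field_of_view / resolution) ** 2
    delay_range = baseline_m * field_of_view / C     # seconds, equals D / (nu d)
    lags = delay_range * 2.0 * bandwidth_hz          # Nyquist interval 1 / (2 dnu)
    return {
        "resolution_mas": resolution * RAD_TO_ARCSEC * 1.0e3,
        "field_of_view_arcsec": field_of_view * RAD_TO_ARCSEC,
        "pixels": pixels,
        "delay_range_s": delay_range,
        "lags": lags,
    }
```

Calling `wide_field_requirements(4.0e6, 25.0, 10.0e9, 1.0e9)` gives a resolution near
1.5 mas, a field of view near 250", and a lag count of 32,000, in line with the values
quoted in the text. The wavelength cancels in both the pixel count and the lag count.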
The fringe rate, in Hz, $\omega(d\tau_g/dt)/2\pi$, is $D\omega_e \sin\theta/\lambda$, where $\omega_e = 2\pi/T_e$, and $T_e$
is the Earth's sidereal period. This leads to a range of fringe rates of


Avpmax = (2) (=) (9.11) 


which requires a minimum sampling time of $(\Delta\nu_{f,\max})^{-1}$, or about 34 ms. Thus, the
number of fringe rate samples in time $T_{\rm obs} = T_e/2$ is about $2.9 \times 10^6$. The total
amount of data in the delay-fringe rate domain on $N(N-1)/2 \sim N^2$ baselines is


TZN ; (9.12) 
d v 


For our case, $N_T \simeq 5 \times 10^{12}$ samples. With 2 bytes/sample and complex numbers,
the minimum storage requirement would be about 160 Tbytes.

Because of the high brightness requirement of VLBI, most of the primary beam 
field will be largely empty but may contain a significant number of compact sources. 
A simple approach would be to image these sources with separate passes through 
the data processing system with separate field centers for each source. The advent 
of software correlators has provided a more efficient approach. The data from the
correlation step (correlation functions of 30,000 lags at a cadence of 34 ms,
in our case) can be shifted to various phase centers and the resulting data streams 
reduced in volume substantially before imaging. The details of the phase center 
shifting, called “(u, v) shifting,” are described by Morgan et al. (2011). This process 
can be embedded in the software architecture without the need for intermediate 
storage of the entire delay—fringe rate data set. An implementation is described by 
Deller et al. (2011), and an example is shown in Fig. 9.1.

Fig. 9.1 An example of the multiple field center imaging technique with data from the EVN at
1.6 GHz. “P-Centre” is the pointing center of the individual antennas, and the circle shows the
primary beam size (FWHM) of a 32-m-diameter antenna. The phase calibrator is J2229+0114.
Fifteen other sources were detected in the field, and the images of three of them are shown in the
inset panels. The contour levels start at the $3\sigma$ level and increase by factors of $\sqrt{2}$. From H.-M. Cao
et al. (2014), reproduced with permission. © ESO.


9.3 Basic Performance of a VLBI System


9.3.1 Time and Frequency Errors 


A block diagram of a basic VLBI system and a generic processor configuration is 
shown in Fig. 9.2. The atomic frequency standards control the phases of the local 
oscillators and the clock pulses for sampling the data. In many VLBI applications, 
such as spectral line observations or astrometric programs, frequency-dependent 
effects must be accounted for precisely. To understand the spectral response of the 
system, we consider the phase shifts encountered by a single frequency component. 
The signals received from a plane wave are $e^{j2\pi\nu t}$ at antenna 1, which we designate
as the time-reference antenna, and $e^{j2\pi\nu(t-\tau_g)}$ at antenna 2, where $\tau_g$ is the geometric
delay. The local oscillators have phases $2\pi\nu_{\rm LO}t + \theta_1$ and $2\pi\nu_{\rm LO}t + \theta_2$, where $\nu_{\rm LO}$
is the local oscillator frequency, and $\theta_1$ and $\theta_2$ are the slowly varying terms that
represent the phase noise due to the frequency standards. To start, we consider
the upper-sideband response in Fig. 9.2, for which the local oscillator frequency
is below the signal frequency. Thus, the phases after mixing are


$$\begin{aligned}
\phi_1^{(1)} &= 2\pi(\nu - \nu_{\rm LO})t - \theta_1 , \\
\phi_2^{(1)} &= 2\pi(\nu - \nu_{\rm LO})t - 2\pi\nu\tau_g - \theta_2 .
\end{aligned} \tag{9.13}$$


The recorded signals each have clock errors $\tau_1$ and $\tau_2$, so the phases of the recorded
signals are


$$\begin{aligned}
\phi_1^{(2)} &= 2\pi(\nu - \nu_{\rm LO})(t - \tau_1) - \theta_1 , \\
\phi_2^{(2)} &= 2\pi(\nu - \nu_{\rm LO})(t - \tau_2) - 2\pi\nu\tau_g - \theta_2 .
\end{aligned} \tag{9.14}$$


During processing, the time series of signal samples from antenna 2 is advanced by
$\tau_g'$, the estimate of $\tau_g$, so


$$\phi_2^{(3)} = 2\pi(\nu - \nu_{\rm LO})(t - \tau_2 + \tau_g') - 2\pi\nu\tau_g - \theta_2 . \tag{9.15}$$


The output of the multidelay correlator and Fourier transform processor is the cross
power spectrum. The phase at the output of the processor for the signal component
at frequency $\nu$ is


$$\begin{aligned}
\phi_{12} &= \phi_1^{(2)} - \phi_2^{(3)} \\
&= 2\pi(\nu - \nu_{\rm LO})(\tau_2 - \tau_1) + 2\pi(\nu\,\Delta\tau_g + \nu_{\rm LO}\tau_g') + \theta_{21} \\
&= 2\pi(\nu - \nu_{\rm LO})(\tau_e + \Delta\tau_g) + 2\pi\nu_{\rm LO}\tau_g + \theta_{21} ,
\end{aligned} \tag{9.16}$$

Fig. 9.2 Block diagram of the essential elements of a VLBI system, including data acquisition
and processing. The system may pass the upper, lower, or both sidebands at the mixer inputs,
depending on the passband of the amplifiers. For millimeter-wavelength observations, there may
be no amplifier preceding the mixer, in which case both sidebands may be accepted. Quantization
and sampling of the signals occur in the format units. The processor system shown illustrates
the configuration described analytically by Eqs. (9.21)–(9.26). Major variations in the processing
system relate to the relative positions of the correlator, fringe rotator (see Fig. 9.21), and FFT
operation in the correlator.


where $\Delta\tau_g = \tau_g - \tau_g'$ is the delay error, $\tau_e = \tau_2 - \tau_1$ is the clock error, and
$\theta_{21} = \theta_2 - \theta_1$. Equation (9.16) applies to the upper-sideband frequency conversion in the mixers
in Fig. 9.2, for which the intermediate frequency (IF), $(\nu - \nu_{\rm LO})$, is positive. For
generality, we also give the lower-sideband response, for which the IF is $(\nu_{\rm LO} - \nu)$.
For the lower sideband,


$$\phi_{12} = 2\pi(\nu_{\rm LO} - \nu)(\tau_e + \Delta\tau_g) - 2\pi\nu_{\rm LO}\tau_g - \theta_{21} . \tag{9.17}$$


Note that in the ideal case where $\tau_1 = \tau_2$, $\theta_1 = \theta_2$, and $\tau_g = \tau_g'$, Eqs. (9.16)
and (9.17) reduce to $\phi_{12} = 2\pi\nu_{\rm LO}\tau_g$ for the upper sideband, and $\phi_{12} = -2\pi\nu_{\rm LO}\tau_g$
for the lower sideband.

The correlation function at the correlator output is real, but not even; thus, the
cross power spectrum $S_{12}$ for a source of continuum radiation has the property


$$S_{12}(\nu') = S_{12}^*(-\nu') , \tag{9.18}$$


where $\nu'$ is the intermediate frequency $(\nu - \nu_{\rm LO})$. We assume that the filters in the
electronics have identical responses and therefore do not introduce any net phase
shifts. The power response function of the instrumental filters is therefore real, and
in terms of the voltage response, $H(\nu)$, of the filters for the two antennas, $S(\nu') =
H_1(\nu')H_2^*(\nu')$. By combining the phase from Eq. (9.16) and the magnitude of the
power response, the cross power spectrum for the upper sideband can be written


$$S_{12}(\nu') = S(\nu')\exp\left\{j\left[2\pi\nu'(\tau_e + \Delta\tau_g) + 2\pi\nu_{\rm LO}\tau_g + \theta_{21}\right]\right\} . \tag{9.19}$$

The corresponding equation for the lower sideband can be obtained from Eq. (9.17).


For the upper sideband, the cross-correlation function can be calculated from
Eqs. (9.18) and (9.19) as


$$\rho_{12}(\tau) = \int_{-\infty}^{\infty} S_{12}(\nu')\,e^{j2\pi\nu'\tau}\,d\nu' . \tag{9.20}$$


For either sideband, integration includes both positive and negative frequencies, and
since $S_{12}$ is Hermitian and $S$ is purely real, we obtain


$$\rho_{12}(\tau) = 2F_1(\tau')\cos(2\pi\nu_{\rm LO}\tau_g + \theta_{21}) - 2F_2(\tau')\sin(2\pi\nu_{\rm LO}\tau_g + \theta_{21}) , \tag{9.21}$$

where $\tau' = \tau + \tau_e + \Delta\tau_g$ and

$$\begin{aligned}
F_1(\tau) &= \int_0^{\infty} S(\nu')\cos(2\pi\nu'\tau)\,d\nu' , \\
F_2(\tau) &= \int_0^{\infty} S(\nu')\sin(2\pi\nu'\tau)\,d\nu' .
\end{aligned} \tag{9.22}$$

If $S(\nu')$ is a rectangular lowpass spectrum with bandwidth $\Delta\nu$, then


$$\begin{aligned}
F_1(\tau) &= \Delta\nu\,\frac{\sin 2\pi\Delta\nu\tau}{2\pi\Delta\nu\tau} , \\
F_2(\tau) &= \Delta\nu\,\frac{\sin^2 \pi\Delta\nu\tau}{\pi\Delta\nu\tau} .
\end{aligned} \tag{9.23}$$


These functions are shown in Fig. 9.3. By substituting Eq. (9.23) into Eq. (9.21), the
cross-correlation function can be written


$$\rho_{12}(\tau) = 2\Delta\nu\cos(2\pi\nu_{\rm LO}\tau_g + \theta_{21} + \pi\Delta\nu\tau')\,\frac{\sin\pi\Delta\nu\tau'}{\pi\Delta\nu\tau'} . \tag{9.24}$$

Fig. 9.3 Functions $F_1(\tau)$ and $F_2(\tau)$, defined in Eq. (9.23), and the quantity $\sqrt{F_1^2(\tau) + F_2^2(\tau)}$.

A similar analysis is given by Rogers (1976).

The variation of $\tau_g$ with time results in fringe oscillations at the correlator output.
The fringe frequency, $(1/2\pi)\,d\phi_{12}/dt$, is constant across the receiver bandwidth
because the (instrumental) delay tracking removes the (geometric) delay-induced
phase variation across the band. For the upper and lower sidebands, the rate of
change of phase has opposite signs; note the term $2\pi\nu_{\rm LO}\tau_g$ in Eqs. (9.16) and (9.17).
See also Fig. 6.5 and the related discussion. In VLBI, the natural fringe frequency
is fast enough that the fringes would be lost in the final averaging of the correlated
data, so rotation of the phase to stop the fringes is applied before the correlator in
Fig. 9.2. In a double-sideband system, if the fringes are stopped for one sideband,
the fringe frequency is doubled for the other sideband. However, it is possible to
obtain the data from each sideband by processing the data twice with appropriate
fringe offsets each time. In VLBI, the source position and other parameters are not
always known with sufficient accuracy when the observation is made, so in Fig. 9.2,
the fringes are stopped after recovery of the data streams to permit trial of different
fringe rotation rates. This involves applying a phase shift to the quantized signals at
the correlator input or output (see Sect. 9.7.1). The effect on the cross-correlation
function or the cross power spectrum can be described as multiplication by $e^{-j2\pi\nu_{\rm LO}\tau_g'}$
for the upper sideband and filtering to select the low-frequency term. This process
results in a complex correlation function:


$$\rho_{12}'(\tau) = \Delta\nu\exp\left[j\left(2\pi\nu_{\rm LO}\Delta\tau_g + \theta_{21} + \pi\Delta\nu\tau'\right)\right]\frac{\sin\pi\Delta\nu\tau'}{\pi\Delta\nu\tau'} . \tag{9.25}$$

Note that the principal fringe term, $2\pi\nu_{\rm LO}\tau_g$, has been eliminated, but residual
fringes can result from terms in $\Delta\tau_g$ and $\Delta\nu$. The resulting cross power spectrum is


$$S_{12}'(\nu') = S(\nu')\exp\left\{j\left[2\pi\nu'(\tau_e + \Delta\tau_g) + 2\pi\nu_{\rm LO}\Delta\tau_g + \theta_{21}\right]\right\} . \tag{9.26}$$


This applies to the upper sideband, for which the fringes have been stopped, and the
correlator output for the other sideband averages to zero.

An example of $\rho_{12}'(\tau)$ for eight values of $\tau$ is shown in Fig. 9.4. The waveforms
represent the correlator output as a function of time for eight different delay offsets
(lags) that differ sequentially by one Nyquist sample interval. Note that there is a
phase shift of $\pi/2$ between adjacent delay steps. The fringe phase can be recovered
by a proper interpolation (see Sect. 9.7.3) to the peak of the correlation function,
or from the phase of the cross power spectrum at $\nu' = 0$. The group delay can
be derived from the position of the correlation peak or the slope of the phase of
the cross power spectrum. Note that the measured delay is $(1/2\pi)\,d\phi_{12}/d\nu$ and is
therefore a group delay, not a phase delay.
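
The $\pi/2$ phase step between adjacent lags follows from the $\pi\Delta\nu\tau'$ term in the exponent of the fringe-stopped correlation function for a rectangular band: stepping $\tau'$ by one Nyquist interval, $(2\Delta\nu)^{-1}$, changes that term by $\pi/2$. The short sketch below evaluates the complex correlation function at four lags inside the main lobe of the sinc envelope and prints the phase step. The bandwidth, the residual phase, and the function name are arbitrary illustrative choices.

```python
import cmath
import math


def rho_stopped(tau_p, dnu, residual_phase=0.0):
    """Fringe-stopped complex correlation for a rectangular band of width dnu.

    tau_p is the total delay offset tau' (s); residual_phase lumps together the
    residual terms 2*pi*nu_LO*dtau_g + theta_21 (radians).
    """
    x = math.pi * dnu * tau_p
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return dnu * sinc * cmath.exp(1j * (residual_phase + x))


dnu = 2.0e6                          # bandwidth, Hz (illustrative)
nyquist = 1.0 / (2.0 * dnu)          # lag spacing, s
offsets = [-1.5, -0.5, 0.5, 1.5]     # lags in units of the Nyquist interval
values = [rho_stopped(k * nyquist, dnu, residual_phase=0.3) for k in offsets]

# Phase difference between successive lags (radians).
steps = [cmath.phase(b / a) for a, b in zip(values, values[1:])]
for k, v in zip(offsets, values):
    print(f"lag {k:+.1f}: amplitude {abs(v) / dnu:.3f}, phase {cmath.phase(v):+.3f} rad")
print("phase steps / (pi/2):", [s / (math.pi / 2.0) for s in steps])
```

Outside the main lobe the sinc factor changes sign, which adds a further $\pi$ to the apparent phase; the four central lags used here avoid that complication.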

The actual local oscillator frequencies may differ from the nominal value $\nu_{\rm LO}$
due to an intentional offset from the nominal frequency or due to an offset error in
the frequency standard. We can expand the phase terms $\theta_1$ and $\theta_2$ to include these

Fig. 9.4 Each sinusoid represents the correlation function [the real part of Eq. (9.25)] vs. time
for a particular delay offset (from the top: 7/2, 5/2, 3/2, 1/2, −1/2, −3/2, −5/2, −7/2 times the Nyquist
interval). The oscillations result from the residual fringe frequency, which includes any offsets
in the frequency standards at the two antennas. Note the progressive phase shift of 90° between
values of the correlation function at successive delay offsets.


frequency offsets, $\Delta\nu_1$ and $\Delta\nu_2$, and zero-mean phase components, $\theta_1'$ and $\theta_2'$:


$$\begin{aligned}
\theta_1 &= 2\pi\Delta\nu_1 t + \theta_1' , \\
\theta_2 &= 2\pi\Delta\nu_2 t + \theta_2' .
\end{aligned} \tag{9.27}$$

Thus, the fringe phase from Eq. (9.26) becomes

$$\phi_{12}(\nu') = 2\pi\left[\nu'(\tau_e + \Delta\tau_g) + \nu_{\rm LO}\Delta\tau_g + \Delta\nu_{\rm LO}t\right] + \theta_{21}' , \tag{9.28}$$


where $\Delta\nu_{\rm LO} = \Delta\nu_2 - \Delta\nu_1$, the difference in the local oscillator frequencies, and
$\theta_{21}' = \theta_2' - \theta_1'$. The fringe frequency $(1/2\pi)\,d\phi_{12}/dt$ contains this local oscillator
difference term. If $\Delta\nu_1$ is due to an offset in a frequency standard and is not zero,
the measured fringe phase is actually more complicated than shown in Eq. (9.28).
The clock error changes with time because of the frequency standard offset and is


$$\tau_1 = (\tau_1)_{t=0} + \frac{\Delta\nu_1}{\nu_{\rm LO}}\,t . \tag{9.29}$$


The recovered time in the processor, based on the time of station 1, is related to the
“true” time $t$ by


$$t_1 = (t_1)_{t=0} + \left(1 + \frac{\Delta\nu_1}{\nu_{\rm LO}}\right)t , \tag{9.30}$$

so that there is a slight shift in all measured frequencies and phases. Thus, there is a
fundamental asymmetry in the processing between the reference station from which 
time is derived and the other stations (Whitney et al. 1976). 

For spectral line observations, the quantity $S(\nu')$ in Eq. (9.26) is the (temporal
frequency) spectrum of the visibility of the source multiplied by the bandpass
response of the interferometer. The bandpass response can be obtained by observation
of the cross power spectrum of a continuum source with a flat spectrum.
Alternately, if the phase responses of the interferometer elements are identical, the 
bandpass response can be obtained from the geometric mean of the power spectra 
from the individual elements. These power spectra are obtained by observing a 
continuum source or blank sky and measuring the autocorrelation of the waveform 
from each individual antenna. The frequency spectrum of the normalized visibility 
can be obtained by dividing the visibility spectrum by the geometric mean of 
the power spectra of the source as measured with each antenna. To correct 
for nonidentical phase responses, it is necessary to measure the complex power 
spectrum on a strong continuum source. Details of calibration procedures in VLBI 
spectral line observations are given by Moran (1973), Reid et al. (1980), Moran and 
Dhawan (1995), and Reid (1995, 1999). 


9.3.2 Retarded Baselines 


The estimate of delay $\tau_g'$ must be accurate enough to ensure that the signal is within
the delay and fringe-frequency ranges of the processor. The simplest approximation
is


$$\tau_g' = \frac{\mathbf{D}\cdot\mathbf{s}_0}{c} , \tag{9.31}$$

where $\mathbf{D} = \mathbf{r}_2 - \mathbf{r}_1$, $\mathbf{r}_1$ and $\mathbf{r}_2$ are vectors from the center of the Earth to each station,
and $\mathbf{s}_0$ is the unit vector to the center of the field. Account must be taken of the fact
that the Earth moves in the time between the arrival of a wave crest at one station 
and at another, since the Earth is not an inertial reference. Therefore, in calculating 
the delay, we should use not the instantaneous baseline but the “retarded” baseline 
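(Cohen and Shaffer 1971). A plane wave reaches the first station at time $t_1$ and the
second station at a time $t_2$, which satisfies the equation


$$\mathbf{k}\cdot\mathbf{r}_1(t_1) - 2\pi\nu t_1 = \mathbf{k}\cdot\mathbf{r}_2(t_2) - 2\pi\nu t_2 , \tag{9.32}$$

where $\mathbf{k} = (2\pi/\lambda)\mathbf{s}_0$. Now $t_2 - t_1 = \tau_g$, so


$$2\pi\nu\tau_g = \mathbf{k}\cdot\left[\mathbf{r}_2(t_1 + \tau_g) - \mathbf{r}_1(t_1)\right] . \tag{9.33}$$

Expansion of $\mathbf{r}_2$ in a Taylor series gives

$$\mathbf{r}_2(t_1 + \tau_g) \simeq \mathbf{r}_2(t_1) + \dot{\mathbf{r}}_2(t_1)\tau_g + \cdots , \tag{9.34}$$

where the dot over $\mathbf{r}_2$ indicates the derivative and

$$2\pi\nu\tau_g \simeq \mathbf{k}\cdot\left[\mathbf{D}(t_1) + \dot{\mathbf{r}}_2(t_1)\tau_g\right] . \tag{9.35}$$


Solving for $\tau_g$ yields


$$\tau_g = \frac{\mathbf{D}\cdot\mathbf{s}_0}{c}\left[1 - \frac{\mathbf{s}_0\cdot\dot{\mathbf{r}}_2}{c}\right]^{-1} , \tag{9.36}$$

where all quantities are evaluated at $t_1$. Since $\dot{\mathbf{r}} = \boldsymbol{\omega}_e\times\mathbf{r}$, where $\boldsymbol{\omega}_e$ is the angular
velocity vector of the Earth and $\times$ indicates the vector cross product, we can rewrite
Eq. (9.36) as


$$\tau_g \simeq \frac{\mathbf{D}\cdot\mathbf{s}_0}{c}\left[1 + \frac{\mathbf{s}_0\cdot(\boldsymbol{\omega}_e\times\mathbf{r}_2)}{c}\right] , \tag{9.37}$$

or

$$\tau_g = \tau_{g0}(1 + \Delta) , \tag{9.38}$$


where $1 + \Delta$ is the term in brackets on the right side of Eq. (9.37). From the $w$ term
in Eq. (4.3),


$$\tau_{g0} = \frac{D}{c}\left[\sin d\sin\delta + \cos d\cos\delta\cos(H - h)\right] . \tag{9.39}$$
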
Here $(H, \delta)$ and $(h, d)$ are the hour angle and declination coordinates of the source
and baseline, respectively, the hour angles usually being specified with respect to
the Greenwich meridian in VLBI practice. Also, we have


$$\Delta = \frac{\omega_e r_2}{c}\cos\mathcal{L}_2\cos\delta\sin(h_2 - H) , \tag{9.40}$$


where $\mathcal{L}_2$, $h_2$, and $r_2$ are the latitude, hour angle, and magnitude of $\mathbf{r}_2$, and $\omega_e$
is the magnitude of $\boldsymbol{\omega}_e$. The function $\Delta$ has a maximum value of $1.5 \times 10^{-6}$, and
$\tau_g$ can differ from $\tau_{g0}$ by a maximum of about 0.05 μs. Note that the appropriate
coordinates in Eq. (9.39) are those that are uncorrected for refraction or diurnal
aberration. An equivalent way of accounting for the retarded baseline is to use
Eq. (9.31) for the delay but correct $h$ and $\delta$ for the diurnal aberration at the
remote site. We introduced the concept of retarded baseline mainly for pedagogical
purposes. It does not appear explicitly when interferometry variables are calculated
in a heliocentric frame.

There are different ways to formulate VLBI observables. One system that may be
described as station-oriented is to refer the measurements to the center of the Earth, 
so that if recordings from two antennas are processed once and then interchanged 
and reprocessed, the phase obtained on the second pass will be the negative of 
that obtained on the first pass. This method presupposes an Earth model, since 
the radius vectors must be known. For applications to astrometry or geodesy, a 
baseline-oriented system is usually preferred, in which the observables have no 
dependence on a priori values of Earth parameters. A more precise discussion of 
VLBI observables can be found in Shapiro (1976) and Cannon (1978). For a full 
barycentric formulation, see Sovers et al. (1998). 
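
The size of the retarded-baseline correction quoted in this section can be checked numerically. The sketch below evaluates $\Delta = (\omega_e r_2/c)\cos\mathcal{L}_2\cos\delta\sin(h_2 - H)$ and $\tau_{g0} = (D/c)[\sin d\sin\delta + \cos d\cos\delta\cos(H - h)]$ for the most favorable geometry. The Earth radius, the rotation rate, and the 10,000-km baseline length are assumed round values chosen only for illustration, and the function names are hypothetical.

```python
import math

C = 299_792_458.0        # speed of light, m/s
OMEGA_E = 7.292115e-5    # Earth's rotation rate, rad/s (assumed value)
R_EARTH = 6.378e6        # equatorial radius, m (assumed value)


def retardation_factor(lat2, h2, H, dec, r2=R_EARTH):
    """Fractional correction Delta to the geometric delay (angles in radians)."""
    return OMEGA_E * r2 / C * math.cos(lat2) * math.cos(dec) * math.sin(h2 - H)


def geometric_delay(D, d, h, H, dec):
    """Uncorrected delay tau_g0 (s) for a baseline of length D (m)."""
    return D / C * (math.sin(d) * math.sin(dec)
                    + math.cos(d) * math.cos(dec) * math.cos(H - h))


# Largest correction: equatorial station and source, h2 - H = 6 hours.
delta_max = retardation_factor(lat2=0.0, h2=math.pi / 2.0, H=0.0, dec=0.0)

# Largest delay for a 10,000-km baseline pointing at the source.
tau_g0_max = geometric_delay(D=1.0e7, d=0.0, h=0.0, H=0.0, dec=0.0)

print(f"max Delta          = {delta_max:.3e}")
print(f"max tau_g0         = {tau_g0_max * 1e3:.2f} ms")
print(f"max delay change   = {delta_max * tau_g0_max * 1e6:.3f} microseconds")
```

The two maxima cannot in general be reached in the same geometry, so their product is an upper bound on the delay difference rather than a typical value.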


9.3.3 Noise in VLBI Observations 


In VLBI, it is often necessary to identify and calibrate the fringe visibility in 
situations of low signal-to-noise ratio and short coherence time. In such cases, 
a thorough understanding of the noise properties of interferometers can be very 
useful. The properties of the fringe amplitude and phase were briefly introduced 
in Sect. 6.2.4. We now develop this discussion further [see Moran (1976) and 
Hjellming (1992)]. The measured visibility is represented by a vector $\mathbf{Z} = \mathbf{V} + \boldsymbol{\varepsilon}$,
where $\mathbf{V}$ and $\boldsymbol{\varepsilon}$ represent the true visibility (the signal) and noise components,
respectively. We select coordinates with $x$ (real) and $y$ (imaginary) so that $\mathbf{V}$ lies
along the $x$ axis, as shown in Fig. 6.8. There is no loss in generality by having
$\mathbf{V}$ lie along the $x$ axis. The phase of the measured visibility resulting from the
noise is a random variable denoted by $\phi$. The components of $\boldsymbol{\varepsilon}$ have independent
zero-mean Gaussian probability distributions in their $x$ and $y$ coordinates, with an
rms deviation $\sigma$ given by Eq. (6.50). In polar coordinates, the amplitude of $\boldsymbol{\varepsilon}$ has
a Rayleigh probability distribution, and the phase of $\boldsymbol{\varepsilon}$ has a uniform probability
distribution. $\mathbf{Z}$ is therefore a random variable whose $x$ and $y$ components, $Z_x$ and $Z_y$,
have a probability distribution given by


$$p(Z_x, Z_y) = \frac{1}{2\pi\sigma^2}\exp\left[-\frac{(Z_x - |V|)^2 + Z_y^2}{2\sigma^2}\right] . \tag{9.41}$$

We convert this probability distribution to polar coordinates,

$$Z_x = Z\cos\phi , \tag{9.42a}$$

$$Z_y = Z\sin\phi , \tag{9.42b}$$


by noting that the Jacobian of the transformation is simply $Z$ [see, e.g., Sivia
(2006)] and obtain the result


$$p(Z, \phi) = \frac{Z}{2\pi\sigma^2}\exp\left[-\frac{(Z\cos\phi - |V|)^2 + Z^2\sin^2\phi}{2\sigma^2}\right] , \tag{9.43}$$

where $Z = \sqrt{Z_x^2 + Z_y^2}$.
The marginal distribution of $Z$ is given by


$$p(Z) = \int_0^{2\pi} p(Z, \phi)\,d\phi , \tag{9.44}$$

which, as in Eq. (6.63a), is

$$p(Z) = \frac{Z}{\sigma^2}\exp\left(-\frac{Z^2 + |V|^2}{2\sigma^2}\right) I_0\left(\frac{Z|V|}{\sigma^2}\right) , \qquad Z > 0 , \tag{9.45}$$


where $I_0$ is a modified Bessel function of order zero, which is defined by

$$I_0(x) = \frac{1}{\pi}\int_0^{\pi} e^{x\cos\theta}\,d\theta . \tag{9.46}$$


$p(Z)$ is known as the Rice distribution.
The marginal distribution of $\phi$ is


$$p(\phi) = \int_0^{\infty} p(Z, \phi)\,dZ , \tag{9.47}$$


which becomes


$$p(\phi) = \frac{1}{2\pi}\exp\left(-\frac{|V|^2}{2\sigma^2}\right) + \frac{1}{\sqrt{8\pi}}\,\frac{|V|\cos\phi}{\sigma}\exp\left(-\frac{|V|^2\sin^2\phi}{2\sigma^2}\right)\left[1 + \operatorname{erf}\left(\frac{|V|\cos\phi}{\sqrt{2}\,\sigma}\right)\right] , \tag{9.48}$$


where erf is the error function defined in Eq. (6.63c). Note that $p(\phi)$ is an even
function of $\phi$, as expected, since the phase of $\mathbf{V}$ was set to zero. Hence, $\langle\phi\rangle = 0$.
$p(\phi)$ was first derived in the interferometry literature by Vinokur (1965). Equations
(9.45) and (9.48) correspond to Eqs. (6.63a) and (6.63b). However, here we
have written $p(\phi)$ in a slightly different but equivalent form to make its asymptotic
behavior more obvious. These probability distributions are plotted in Fig. 6.9.

The expectations of $Z$, $Z^2$, and $Z^4$ are


$$\langle Z\rangle = \sqrt{\frac{\pi}{2}}\,\sigma\exp\left(-\frac{|V|^2}{4\sigma^2}\right)\left[\left(1 + \frac{|V|^2}{2\sigma^2}\right) I_0\left(\frac{|V|^2}{4\sigma^2}\right) + \frac{|V|^2}{2\sigma^2}\,I_1\left(\frac{|V|^2}{4\sigma^2}\right)\right] , \tag{9.49}$$


$$\langle Z^2\rangle = |V|^2 + 2\sigma^2 , \tag{9.50}$$

and

$$\langle Z^4\rangle = |V|^4 + 8\sigma^2|V|^2 + 8\sigma^4 , \tag{9.51}$$


where $I_1$ is the modified Bessel function of order one, defined by

$$I_1(x) = \frac{1}{\pi}\int_0^{\pi} e^{x\cos\theta}\cos\theta\,d\theta . \tag{9.52}$$


Higher even-order moments of $Z$ can be readily calculated using the moment
theorem for a Gaussian random distribution. When no signal is present, i.e., when
$|V| = 0$, $I_0(0) = 1$, and the probability distributions of $Z$ and $\phi$ are those of the
noise, which are Rayleigh and uniform distributions, respectively:


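$$p(Z) = \frac{Z}{\sigma^2}\exp\left(-\frac{Z^2}{2\sigma^2}\right) , \qquad Z > 0 , \tag{9.53}$$

and

$$p(\phi) = \frac{1}{2\pi} , \qquad 0 \le \phi < 2\pi . \tag{9.54}$$


For the no-signal case,


$$\langle Z\rangle = \sqrt{\frac{\pi}{2}}\,\sigma , \tag{9.55}$$

$$\sigma_Z = \sqrt{\langle Z^2\rangle - \langle Z\rangle^2} = \sigma\sqrt{2 - \frac{\pi}{2}} , \tag{9.56}$$

and

$$\sigma_\phi = \frac{\pi}{\sqrt{3}} . \tag{9.57}$$


For the weak-signal case, defined as $|V| \ll \sigma$, we use the approximations
$I_0(x) \simeq 1 + x^2/4$ and $I_1(x) \simeq x/2$. The probability distributions of $Z$ and $\phi$ are


$$p(Z) \simeq \frac{Z}{\sigma^2}\exp\left(-\frac{Z^2}{2\sigma^2}\right)\left[1 - \frac{|V|^2}{2\sigma^2} + \frac{1}{4}\left(\frac{Z|V|}{\sigma^2}\right)^2\right] \tag{9.58}$$

and

$$p(\phi) \simeq \frac{1}{2\pi} + \frac{1}{\sqrt{8\pi}}\,\frac{|V|}{\sigma}\cos\phi \tag{9.59}$$

to first order in $|V|/\sigma$. Thus,


$$\langle Z\rangle \simeq \sigma\sqrt{\frac{\pi}{2}}\left(1 + \frac{|V|^2}{4\sigma^2}\right) , \tag{9.60}$$

$$\sigma_Z \simeq \sigma\sqrt{2 - \frac{\pi}{2}}\left(1 + \frac{|V|^2}{4\sigma^2}\right) , \tag{9.61}$$


and


$$\sigma_\phi \simeq \frac{\pi}{\sqrt{3}}\left(1 - \sqrt{\frac{9}{2\pi^3}}\,\frac{|V|}{\sigma}\right) . \tag{9.62}$$


Note that $Z$ departs from a Rayleigh distribution slowly as $|V|/\sigma$ increases, whereas
the probability distribution of $\phi$ is confined to a spread (full width at half-maximum)
of only 110° and 70° for $|V|/\sigma$ equal to 1 and 2, respectively (see Fig. 6.9). Hence,
as a practical matter, it is often easier to identify a weak signal by its phase rather
than by its amplitude, as shown in Fig. 9.5.

Fig. 9.5 A simulated visibility spectrum of a source with a single spectral line with a Gaussian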
profile of amplitude equal to 2 and centered at 100 MHz (solid line). The spectral resolution is
1 MHz, and $\sigma = 1$ (hence $|V|/\sigma = 2$) at line center. This demonstrates that weak signals can be
more easily identified (by eye) in phase than in amplitude.

For the strong-signal case, $|V| \gg \sigma$, $I_0(x) \simeq e^x/\sqrt{2\pi x}$. The probability
functions for $Z$ and $\phi$ are approximately Gaussian distributions and are given by


$$p(Z) \simeq \frac{1}{\sqrt{2\pi}\,\sigma}\sqrt{\frac{Z}{|V|}}\exp\left[-\frac{(Z - |V|)^2}{2\sigma^2}\right] \tag{9.63}$$

and

$$p(\phi) \simeq \frac{1}{\sqrt{2\pi}}\,\frac{|V|}{\sigma}\exp\left(-\frac{|V|^2\phi^2}{2\sigma^2}\right) . \tag{9.64}$$


For this case,


$$\langle Z\rangle \simeq |V|\left(1 + \frac{\sigma^2}{2|V|^2}\right) , \tag{9.65}$$

$$\sigma_Z \simeq \sigma\left(1 - \frac{\sigma^2}{4|V|^2}\right) , \tag{9.66}$$

and

$$\sigma_\phi \simeq \frac{\sigma}{|V|} . \tag{9.67}$$


The quantities $\sigma_Z$ and $\sigma_\phi$ for the full range of $|V|/\sigma$ are shown in Fig. 9.6. Hence, in
the strong-signal case, the statistics of $Z$ are approximately Gaussian (see Fig. 6.9),
and $\langle Z\rangle$ approaches $|V|$. In this case, $N$ samples of $Z$ can be averaged, and the
signal-to-noise ratio improves with $\sqrt{N}$. In the weak-signal case, the perturbation

Fig. 9.6 The values of $\sigma_Z/\sigma$ and $\sigma_\phi$ as a function of $|V|/\sigma$. Approximate expressions for
$|V|/\sigma \ll 1$ are given in Eqs. (9.61) and (9.62) and for $|V|/\sigma \gg 1$ in Eqs. (9.66) and (9.67).

of the Rayleigh noise distribution by the signal is small, and as we shall discuss in
Sect. 9.5, it is difficult to improve the signal-to-noise ratio by averaging beyond the
coherence time of the system.
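
The moments quoted in this section are easy to check by simulation. The following sketch draws amplitudes $Z = |\mathbf{V} + \boldsymbol{\varepsilon}|$ with independent Gaussian noise of rms $\sigma$ in each component and compares the sample moments with the no-signal values ($\langle Z\rangle = \sqrt{\pi/2}\,\sigma$, $\sigma_Z = \sigma\sqrt{2 - \pi/2}$) and with $\langle Z^2\rangle = |V|^2 + 2\sigma^2$. The sample size, seed, and function names are arbitrary choices for illustration.

```python
import math
import random


def simulate_amplitudes(v, sigma, n, seed=1):
    """Draw n visibility amplitudes for true amplitude v and noise rms sigma."""
    rng = random.Random(seed)
    return [math.hypot(v + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


def moments(z):
    """Return the sample mean, mean square, and standard deviation of z."""
    n = len(z)
    m1 = sum(z) / n
    m2 = sum(x * x for x in z) / n
    return m1, m2, math.sqrt(m2 - m1 * m1)


sigma = 1.0

# No-signal case: Rayleigh statistics.
mean0, msq0, rms0 = moments(simulate_amplitudes(0.0, sigma, 100_000))
print(f"<Z>   = {mean0:.4f}  (expected {math.sqrt(math.pi / 2.0) * sigma:.4f})")
print(f"sig_Z = {rms0:.4f}  (expected {sigma * math.sqrt(2.0 - math.pi / 2.0):.4f})")

# Strong-signal case: <Z^2> = |V|^2 + 2 sigma^2 holds for any |V|.
v = 5.0
mean5, msq5, rms5 = moments(simulate_amplitudes(v, sigma, 100_000))
print(f"<Z^2> = {msq5:.3f}  (expected {v * v + 2.0 * sigma ** 2:.3f})")
print(f"<Z>   = {mean5:.4f}  (strong-signal approximation {v * (1 + sigma**2 / (2 * v * v)):.4f})")
```

Because $\langle Z^2\rangle$ exceeds $|V|^2$ by exactly $2\sigma^2$ at any signal level, the mean square is a convenient starting point for the bias corrections discussed later in the chapter.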


9.3.4 Probability of Error in the Signal Search 


When starting a new session of VLBI observations with an ad hoc array, the first 
task in the processing is to search for fringes, i.e., fringe finding. This is necessary 
because of the uncertainties in the station clocks and their drift rates and means 
that the instrumental delay and fringe frequency must be found. This step is often 
unnecessary with a dedicated VLBI array, for which the values of fringe rate and 
delay are continuously updated from successive observations. A fringe search must 
be carried out on a large two-dimensional grid, as shown in Fig. 9.7. For example,

Fig. 9.7 Fringe amplitude as a function of residual fringe frequency (a) and delay (b). The
one-dimensional plots are the peak fringe amplitude vs. delay and fringe frequency. The probability
distribution of the noise in these plots is given by Eq. (9.71) and the bias level by Eq. (9.72).

consider an experiment in which $\Delta\nu$ = 50 MHz at an observing frequency of
$10^{11}$ Hz. The delay increments are equal to the sampling interval of 0.01 μs. An
instrumental delay uncertainty of ±1 μs requires a search of 200 delay intervals.
If the coherent integration time is 200 s and the frequency standards are set only to
a fractional accuracy of $10^{-11}$, then ±1 Hz must be searched, which at an interval
size of 0.005 Hz is 400 discrete frequencies. The total number of cells to be searched
is 80,000. If there is no signal present, then $p(Z)$ will be given by Eq. (9.53). The
cumulative probability distribution (that is, the probability that $Z$ is less than $Z_0$) in
this case is the integral of Eq. (9.53) from zero to $Z_0$, or


$$P(Z_0) = 1 - \exp\left(-\frac{Z_0^2}{2\sigma^2}\right) . \tag{9.68}$$


The cumulative probability distribution for the maximum of $n$ independent samples
$Z_m = \max\{Z_1, Z_2, \ldots, Z_n\}$ is


$$P(Z_m) = \left[1 - \exp\left(-\frac{Z_m^2}{2\sigma^2}\right)\right]^n . \tag{9.69}$$


Thus, the probability of one or more samples exceeding $Z_m$, which we call the
probability of error, $p_e$, is


$$p_e = 1 - \left[1 - \exp\left(-\frac{Z_m^2}{2\sigma^2}\right)\right]^n . \tag{9.70}$$


This function is shown in Fig. 9.8. The probability distribution of $Z_m$ is obtained by
differentiating Eq. (9.69),


$$p(Z_m) = \frac{nZ_m}{\sigma^2}\exp\left(-\frac{Z_m^2}{2\sigma^2}\right)\left[1 - \exp\left(-\frac{Z_m^2}{2\sigma^2}\right)\right]^{n-1} . \tag{9.71}$$


For large $n$, this probability distribution is nearly Gaussian, with mean value and
standard deviation given by


$$\langle Z_m\rangle \simeq \sigma\sqrt{2\ln n} , \tag{9.72}$$

$$\sigma_m \simeq \frac{0.77\,\sigma}{\sqrt{\ln n}} . \tag{9.73}$$
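
For the 80,000-cell search described above, the detection threshold that corresponds to a chosen probability of error follows from inverting $p_e = 1 - [1 - \exp(-Z_m^2/2\sigma^2)]^n$. The sketch below does this, and also evaluates the bias level $\sigma\sqrt{2\ln n}$. The target probability of $10^{-3}$ and the function names are illustrative assumptions; `math.log1p` and `math.expm1` are used only to preserve precision when the single-cell tail probability is very small.

```python
import math


def prob_error(zm_over_sigma, n):
    """Probability that at least one of n noise-only cells exceeds Z_m."""
    tail = math.exp(-0.5 * zm_over_sigma ** 2)   # single-cell exceedance
    if tail >= 1.0:
        return 1.0
    # 1 - (1 - tail)^n, evaluated without loss of precision for tiny tail
    return -math.expm1(n * math.log1p(-tail))


def threshold(p_e, n):
    """Value of Z_m / sigma that gives probability of error p_e over n cells."""
    tail = -math.expm1(math.log1p(-p_e) / n)     # 1 - (1 - p_e)^(1/n)
    return math.sqrt(-2.0 * math.log(tail))


n_cells = 200 * 400                  # delay cells x fringe-frequency cells
z_thr = threshold(1.0e-3, n_cells)   # threshold for p_e = 0.1% (assumed target)
bias = math.sqrt(2.0 * math.log(n_cells))

print(f"cells searched             : {n_cells}")
print(f"threshold for p_e = 1e-3   : {z_thr:.2f} sigma")
print(f"expected noise maximum     : {bias:.2f} sigma")
print(f"p_e at a 5-sigma threshold : {prob_error(5.0, n_cells):.3f}")
```

The threshold grows only slowly with the number of cells, which is why large search grids remain practical.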


Examples of $p(Z_m)$ for various values of $n$ are shown in Fig. 9.9. It is frequently
useful to reduce a two-dimensional function, such as the one shown in Fig. 9.7 of
fringe amplitude vs. fringe frequency and delay, to a one-dimensional function by
searching for the maximum value of the function over one variable. This search

Fig. 9.8 Probability that one or more samples of the fringe amplitude will exceed the value $Z_m/\sigma$
in the absence of a signal, as given by Eq. (9.70). The curves are labeled by the number of samples
measured.

Fig. 9.9 Probability distribution of the maximum of $n$ random variables that have Rayleigh
distributions, as given by Eq. (9.71).

process introduces a bias, equal to $\langle Z_m\rangle$, into the one-dimensional function. This
bias increases with the number of samples and obscures weak signals. 

We can also calculate the probability of misidentifying a signal. Suppose that we 
have measurements of fringe amplitude at two values of delay or fringe frequency 
with the signal present at one value. The probability that the amplitude in the channel
with the signal ($Z_1$) is larger than the amplitude in the channel with only the noise
($Z_2$) is


$$p(Z_1 > Z_2) = \int_0^{\infty} p(Z_1)\left[\int_0^{Z_1} p(Z_2)\,dZ_2\right] dZ_1 . \tag{9.74}$$


$p(Z_1)$ is given by Eq. (9.45), and $p(Z_2)$ is given by Eq. (9.53). We can generalize
this result for a search over $n$ channels where the signal channel amplitude is $Z_1$.
The probability that $Z_1$ will exceed the values of $Z$ in the other channels is, from
Eqs. (9.68) and (9.74),


$$p(Z_1 > Z_2, \ldots, Z_n) = \int_0^{\infty} p(Z)\left[1 - \exp\left(-\frac{Z^2}{2\sigma^2}\right)\right]^{n-1} dZ , \tag{9.75}$$


where $p(Z)$ is given by Eq. (9.45). Thus, the probability of one or more samples
exceeding the amplitude of the signal is


$$p_e' = 1 - \int_0^{\infty} p(Z)\left[1 - \exp\left(-\frac{Z^2}{2\sigma^2}\right)\right]^{n-1} dZ . \tag{9.76}$$


$p_e'$ is plotted in Fig. 9.10. For example, if the search is over 100 channels, a
probability of misidentification of less than 0.1% requires $|V|/\sigma > 6.5$.
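
The misidentification probability has no closed form, but it can be evaluated by direct numerical integration of the Rice distribution against the noise-only cumulative distribution raised to the power $n - 1$. The sketch below is one way to do this with the standard library only. To avoid overflow, it uses the exponentially scaled Bessel function $e^{-x}I_0(x)$, computed from the integral definition of $I_0$, so that the Rice density becomes $(Z/\sigma^2)\exp[-(Z - |V|)^2/2\sigma^2]\,e^{-x}I_0(x)$ with $x = Z|V|/\sigma^2$. The step counts, the integration limit of $|V| + 10\sigma$, and all function names are assumptions made for this illustration.

```python
import math


def scaled_i0(x, steps=120):
    """exp(-x) * I0(x) from the integral definition of I0 (midpoint rule)."""
    h = math.pi / steps
    total = sum(math.exp(x * (math.cos((i + 0.5) * h) - 1.0)) for i in range(steps))
    return total * h / math.pi


def rice_pdf(z, v, sigma=1.0):
    """Rice probability density of the amplitude z for true amplitude v."""
    x = z * v / sigma ** 2
    return z / sigma ** 2 * math.exp(-0.5 * ((z - v) / sigma) ** 2) * scaled_i0(x)


def prob_misidentify(v, n, sigma=1.0, steps=1500):
    """Probability that a noise-only channel exceeds the signal channel."""
    z_max = v + 10.0 * sigma                 # upper integration limit (assumed)
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        noise_cdf = 1.0 - math.exp(-0.5 * (z / sigma) ** 2)
        total += rice_pdf(z, v, sigma) * noise_cdf ** (n - 1)
    return 1.0 - total * h


results = {v: prob_misidentify(v, n=100) for v in (0.0, 2.0, 4.0, 6.5)}
for v, p in results.items():
    print(f"|V|/sigma = {v:3.1f}: probability of misidentification = {p:.4g}")
```

With no signal present, every one of the $n$ channels is equally likely to hold the maximum, so the result tends to $1 - 1/n$, which provides a simple check on the integration.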


9.3.5 Coherent and Incoherent Averaging 


We wish to estimate the amplitude of a barely detectable signal. We examine a time 
series of correlator output values in which the phase, $\phi(t)$, represents the effects
of receiver noise, fluctuations in the frequency standards, or fluctuations in the
atmospheric path. An example of phase vs. time from a VLBI measurement is shown
in Fig. 9.11. The correlator output is


$$r(t) = Z(t)\,e^{j\phi(t)} . \tag{9.77}$$


How do we estimate $|V|$ when the time range of the data exceeds the coherence
time? There are two useful procedures, the first in the spectral domain and the
second in the time domain. Suppose that $r(t)$ is sampled at intervals short with
respect to the coherence time, $\tau_c$, thus generating a time series of samples $r_n$. The

Fig. 9.10 Probability that one or more samples of fringe amplitude among the samples with no
signal will exceed the fringe amplitude of the sample with the signal, vs. the signal amplitude,
$|V|$, as given in Eq. (9.76). The curves are labeled according to the total number of samples $n$. The
asymptotic value of $p_e'$ as $|V|/\sigma$ goes to zero is $1 - 1/n$.


discrete Fourier transform (see Appendix 8.4) of $r_n$ is


$$R_k = \sum_{n=0}^{N-1} r_n\,e^{-j2\pi nk/N} , \tag{9.78}$$


where Rx is the N-point discrete fringe rate spectrum ranging in frequency from 
—N/2t, to N/2t,. Hence, from Parseval’s theorem [Eq. (8.179)], 


$$\sum_{n=0}^{N-1} |r_n|^2 = \frac{1}{N}\sum_{k=0}^{N-1} |R_k|^2 . \tag{9.79}$$


Using Eq. (9.50), we can write an unbiased estimator of $|V|^2$, valid for large $N$, as


$$|V|_e^2 = \left(\frac{1}{N^2}\sum_{k=0}^{N-1} |R_k|^2\right) - 2\sigma^2 . \tag{9.80}$$


When the total span of the data exceeds the coherence time of the interferometer,
the fringe rate spectrum becomes complicated, but Eq. (9.80) provides a prescription

Fig. 9.11 Fringe phase vs. time from an observation of a strong source [the water vapor maser
in W3 (OH)] on a three-baseline VLBI experiment at 22 GHz. Two of the stations, Haystack
Observatory and the Naval Research Laboratory (Maryland Point Observatory), were equipped
with hydrogen maser frequency standards, while the National Radio Astronomy Observatory
used a rubidium vapor frequency standard. The phase noise in the top plot is dominated by
contributions from the receivers and the atmosphere, while the phase noise in the bottom two plots
is dominated by the phase noise in the rubidium frequency standard. These data were obtained in
1971 with the Mark I VLBI system.

for gathering all of its frequency components into an unbiased estimate of $|V|^2$. See
Clark (1968) and Clark et al. (1968) for applications of this method.

The second method for estimating $|V|^2$, based on the time series, comes directly
from Eq. (9.50),


$$|V|_e^2 = \frac{1}{N}\sum_{i=1}^{N} Z_i^2 - 2\sigma^2 . \tag{9.81}$$


Imaging or model analysis is usually based on estimates of $|V|$, not $|V|^2$. To obtain
an unbiased estimate of $|V|$, we first examine the properties of the quantity

$$|V|_b = \left[\frac{1}{N}\sum_{i=1}^{N} Z_i^2\right]^{1/2} . \tag{9.82}$$

Recall that
$$Z_i^2 = (|V| + \varepsilon_{xi})^2 + \varepsilon_{yi}^2 , \tag{9.83}$$


where $\varepsilon_{xi}$ and $\varepsilon_{yi}$ are Gaussian random variables with zero mean and variance $\sigma^2$.
Equation (9.82) becomes


$$|V|_b = |V|\left\{1 + \frac{1}{N}\sum_{i=1}^{N}\left[\frac{2\varepsilon_{xi}}{|V|} + \frac{\varepsilon_{xi}^2 + \varepsilon_{yi}^2}{|V|^2}\right]\right\}^{1/2} . \tag{9.84}$$


We assume that the terms in the brackets are $\ll 1$ and then expand Eq. (9.84) to
second order, which is necessary to retain all the second-order terms involving $\varepsilon_{xi}$.
Then the expectation of $|V|_b$ becomes


$$\langle |V|_b\rangle \simeq |V|\left[1 + \frac{\sigma^2}{|V|^2}\left(1 - \frac{1}{2N}\right)\right] , \tag{9.85}$$


which leads directly to an unbiased estimate of $|V|$ of


$$|V|_e = \left[\frac{1}{N}\sum_{i=1}^{N} Z_i^2 - \sigma^2\left(2 - \frac{1}{N}\right)\right]^{1/2} . \tag{9.86}$$


Equation (9.86) is accurate to < 5% for $|V|/\sigma > 2$ and $N = 1$, and $|V|/\sigma > 0.3$ and
$N = 100$. This estimator has several interesting properties. For $N \gg 1$, it leads to
the result suggested by Eq. (9.81). However, for $N = 1$ and $Z_1 = Z$, it leads to the
result


$$|V|_e = \sqrt{Z^2 - \sigma^2} . \tag{9.87}$$


Equation (9.87) is used to determine the polarized flux from single measurements
of Stokes $Q$ and $U$ [see Wardle and Kronberg (1974)]. For one measurement of
$Z$, $|V|_e$ in Eq. (9.87) is a good approximation for the most likely value of $|V|$
given $p(Z)$ defined in Eq. (9.45). See Johnson et al. (2015) for further discussion
and applications.
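
The effect of the bias correction can be seen in a small Monte Carlo experiment. The sketch below compares the average of the uncorrected root-mean-square amplitude with the average of the corrected estimator $[\,(1/N)\sum Z_i^2 - \sigma^2(2 - 1/N)\,]^{1/2}$ for $|V|/\sigma = 3$ and $N = 10$. The trial count, seed, and function names are illustrative; returning zero when the argument of the square root is negative is an assumption of this sketch, since the text does not specify how such cases are handled.

```python
import math
import random


def biased_amplitude(z):
    """Root-mean-square of the measured amplitudes (no noise correction)."""
    return math.sqrt(sum(x * x for x in z) / len(z))


def debiased_amplitude(z, sigma):
    """Noise-corrected amplitude estimate; 0.0 if the argument is negative."""
    n = len(z)
    arg = sum(x * x for x in z) / n - sigma ** 2 * (2.0 - 1.0 / n)
    return math.sqrt(arg) if arg > 0.0 else 0.0


def trial_means(v, sigma, n, trials, seed=2):
    """Average both estimators over many independent sets of n samples."""
    rng = random.Random(seed)
    sum_b = sum_e = 0.0
    for _ in range(trials):
        z = [math.hypot(v + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(n)]
        sum_b += biased_amplitude(z)
        sum_e += debiased_amplitude(z, sigma)
    return sum_b / trials, sum_e / trials


v_true, sigma, n_avg = 3.0, 1.0, 10
mean_b, mean_e = trial_means(v_true, sigma, n_avg, trials=20_000)
predicted_b = v_true * (1.0 + sigma ** 2 / v_true ** 2 * (1.0 - 1.0 / (2 * n_avg)))

print(f"true |V|            : {v_true:.3f}")
print(f"mean of biased      : {mean_b:.3f}  (second-order prediction {predicted_b:.3f})")
print(f"mean of debiased    : {mean_e:.3f}")
```

For a single sample ($N = 1$) the same function reduces to $\sqrt{Z^2 - \sigma^2}$.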

From Eqs. (9.50), (9.51), and (9.81), we have (|V|2) = |V|? and (V$) = 
|V| + 407(|V|* + 07)/N, so that the signal-to-noise ratio is 


7 (IV) a YN yp l 
VI+|VP/o? 


"IVY (VE? 20? 
|V|/o is equal to the signal-to-noise ratio at the output of a single-multiplier 
correlator, as given by Eqs. (6.49) and (6.50). For VLBI observations, the quanti- 
zation efficiency described in Sect. 8.3, ng, is replaced by the general loss factor n, 
described in Sect. 9.7, and from Eq. (6.64), we obtain |V|/o = (Tan/Ts)./2Avte. 


(9.88) 
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Equation (9.88) then becomes 


T2172 Av2tte 
Ra = AD ae (9.89) 
Ts \ (+2747? Avt,/TS) 


where $\tau = N\tau_c$ is the total integrating time. The two limiting cases of Eq. (9.89) are

$$\mathcal{R}_{sn} \simeq \frac{\eta T_A}{T_S}\sqrt{\frac{\Delta\nu\,\tau}{2}}, \qquad T_A \gg \frac{T_S}{\eta\sqrt{2\Delta\nu\,\tau_c}}, \tag{9.90}$$

$$\mathcal{R}_{sn} \simeq \left(\frac{\eta T_A}{T_S}\right)^2 \Delta\nu\sqrt{\tau\tau_c}, \qquad T_A \ll \frac{T_S}{\eta\sqrt{2\Delta\nu\,\tau_c}}. \tag{9.91}$$

Note that in the strong-signal case, incoherent averaging is not needed. When
incoherent averaging is used, the coherent averaging time should be as long as
possible without decreasing the fringe amplitude. If we assume that $\mathcal{R}_{sn} = 4$ for
detection, and recall that $\tau = N\tau_c$, then for the weak-signal case, the minimum
detectable antenna temperature can be found from Eq. (9.91) to be

$$(T_A)_{\min} = \frac{2T_S}{\eta N^{1/4}\sqrt{\Delta\nu\,\tau_c}}. \tag{9.92}$$


Thus, because of the $N^{1/4}$ dependence in Eq. (9.92), incoherent averaging is
effective only if $N$ is very large. If the coherence time is of the order of $1/\Delta\nu$, then
the observing system reduces to a form of incoherent, or intensity, interferometer
[see Sect. 17.1 and Clark (1968)]. For the weak-signal case, Eq. (9.91) then becomes

$$\mathcal{R}_{sn} \simeq \left(\frac{\eta T_A}{T_S}\right)^2 \sqrt{\Delta\nu\,\tau}. \tag{9.93}$$
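
The expressions above lend themselves to a quick numerical check. The sketch below evaluates the signal-to-noise ratio in the form of Eq. (9.89) and the weak-signal detection limit of Eq. (9.92); the function and parameter names, and the numerical values in the example, are hypothetical and chosen only for illustration.

```python
import math


def snr_incoherent(t_a, t_s, eta, bandwidth_hz, tau_c, n_segments):
    """Signal-to-noise ratio in the form of Eq. (9.89) for N incoherently
    averaged segments, each of coherent length tau_c."""
    x = (t_a * eta / t_s) ** 2            # (T_A eta / T_S)^2
    tau = n_segments * tau_c              # total integrating time
    return (x * bandwidth_hz * math.sqrt(tau * tau_c)
            / math.sqrt(1.0 + 2.0 * x * bandwidth_hz * tau_c))


def min_antenna_temperature(t_s, eta, bandwidth_hz, tau_c, n_segments, snr=4.0):
    """Weak-signal detection limit from Eq. (9.91); with snr = 4 this is
    the expression of Eq. (9.92)."""
    return (math.sqrt(snr) * t_s
            / (eta * n_segments ** 0.25 * math.sqrt(bandwidth_hz * tau_c)))


# Hypothetical system: T_S = 100 K, eta = 1, 1 MHz bandwidth, tau_c = 1 s.
system = dict(t_s=100.0, eta=1.0, bandwidth_hz=1.0e6, tau_c=1.0)
print(min_antenna_temperature(n_segments=16, **system))      # detection limit in K
print(snr_incoherent(t_a=1.0e-4, n_segments=100, **system))  # weak-signal regime
```

Because the limit scales as $N^{-1/4}$, increasing $N$ from 16 to 256 in this example lowers the detectable antenna temperature by only a factor of 2.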


9.4 Fringe Fitting for a Multielement Array 


9.4.1 Global Fringe Fitting 


In Sect. 9.3, we considered the problem of searching for fringes in the output from 
a single baseline. For VLBI, the basic requirement in fringe fitting is to determine 
the fringe phase (i.e., the phase of the visibility) and the rate of change of the fringe 
phase, with time and with frequency or delay. Fringe rate offsets result from errors 
in the positions of the source or antennas as well as antenna-related effects such as 
frequency offsets in local oscillators. Most of these can be specified as factors that 
relate to individual antennas, rather than to baselines. Because of this, data from all 
baselines can be used simultaneously to determine the fringe rate parameters. By 
simultaneously using all of the data from a multielement VLBI array, it is possible
to detect fringes that are too weak to be seen on a single baseline. This is particularly 
important for VLBI arrays with similar antennas and receivers; with an ad hoc array, 
a possible alternative is to use the data from the two most sensitive antennas to find 
the fringes and let this result constrain the solutions for other baselines. 

A method of analysis that is based on simultaneous use of the complete data 
set from a multiantenna observation was developed by Schwab and Cotton (1983) 
and is referred to as global fringe fitting. Let $Z_{mn}(t)$ be the correlator output, that
is, the measured visibility, from the baseline for antennas $m$ and $n$. The complex
(voltage) gain for antenna $n$ and the associated receiving system is $g_n(t_k, \nu_\ell)$, where
$t_k$ represents a (coherently) time-integrated sample of the correlator output for
frequency channel $\nu_\ell$. Thus,

$$Z_{mn}(t_k, \nu_\ell) = g_m(t_k, \nu_\ell)\,g_n^*(t_k, \nu_\ell)\,V_{mn}(t_k, \nu_\ell) + \epsilon_{mnk\ell}, \tag{9.94}$$

where $V_{mn}$ is the true visibility for baseline $mn$, and $\epsilon_{mnk\ell}$ represents the
observational errors that result principally from noise. It should be remembered that the
noise terms are present in all the measurements, but beyond this point, they will
usually be omitted from the equations. The gain terms can be written as

$$g_n(t_k, \nu_\ell) = |g_n|\,e^{j\psi_n(t_k, \nu_\ell)}. \tag{9.95}$$

To simplify the situation in Eq. (9.95), we assume that the gain terms and the
amplitude of the source visibility are constant over the range of $(t, \nu)$ space covered
by the observation. To first order, we can then write


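$$\begin{aligned} Z_{mn}(t_k, \nu_\ell) = {} & |g_m||g_n||V| \exp\left[j(\psi_m - \psi_n)(t_0, \nu_0)\right] \\ & \times \exp\left\{ j\left[ \left.\frac{\partial(\psi_m - \psi_n + \phi_{mn})}{\partial t}\right|_{(t_0,\nu_0)} (t_k - t_0) + \left.\frac{\partial(\psi_m - \psi_n + \phi_{mn})}{\partial \nu}\right|_{(t_0,\nu_0)} (\nu_\ell - \nu_0) \right]\right\}, \end{aligned} \tag{9.96}$$

where $\phi_{mn}$ is the phase of the (true) visibility $V_{mn}$. The rates of change of the phase
of the measured visibility with respect to time and frequency are the fringe rate

$$r_{mn} = \left.\frac{\partial(\psi_m - \psi_n + \phi_{mn})}{\partial t}\right|_{(t_0,\nu_0)} \tag{9.97}$$

and the delay

$$\tau_{mn} = \left.\frac{\partial(\psi_m - \psi_n + \phi_{mn})}{\partial \nu}\right|_{(t_0,\nu_0)} \tag{9.98}$$

for the baseline $mn$ at time and frequency $(t_0, \nu_0)$. In terms of these quantities, we
can relate the measured visibility (correlator output) to the true visibility as follows:

$$Z_{mn}(t_k, \nu_\ell) = |g_m||g_n|\,V_{mn}(t_k, \nu_\ell) \exp\left\{ j\left[ (\psi_m - \psi_n)\big|_{t=t_0} + (r_m - r_n)(t_k - t_0) + (\tau_m - \tau_n)(\nu_\ell - \nu_0) \right]\right\}. \tag{9.99}$$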


For each antenna, there are four unknown parameters: the modulus of the gain, the 
phase of the gain, the fringe rate, and the delay. Since all of the data are in the form 
of relative phases of two antennas, it is necessary to designate one antenna as the 
reference. For this antenna, the phase, fringe rate, and delay are usually taken to be 
zero, leaving $4n_a - 3$ parameters to be determined. However, it is possible to simplify
further and consider only the phase terms in the fringe fitting. The amplitudes of the
antenna gains are subsequently calibrated separately. The number of parameters to
be determined is thereby reduced to $3(n_a - 1)$. Then to obtain the global fringe
solution, the source visibility $V_{mn}$ is represented by a model of the source, and a
least-mean-squares fit of the parameters in Eq. (9.99) to the visibility measurements 
is made. For details on a method for the least-mean-squares solution, see Schwab 
and Cotton (1983). The source model, which is a “first guess” of the true structure, 
could in some cases be as simple as a point source. 
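
The antenna-based structure of the model can be illustrated with a short sketch. It builds the exponent of Eq. (9.99) from per-antenna phases, fringe rates, and delays, with the gain moduli set to unity, and compares the number of unknowns, $3(n_a - 1)$, with the number of baselines. This is not the Schwab and Cotton least-squares solver; the function names and the numerical values are hypothetical, and the rates and delays are expressed as phase derivatives (radians per second and radians per hertz), following Eqs. (9.97) and (9.98).

```python
import cmath


def baseline_phase(psi, rate, delay, m, n, dt, dnu):
    """Instrumental phase on baseline (m, n) built from antenna-based terms,
    following the exponent of Eq. (9.99); dt = t_k - t_0, dnu = nu_l - nu_0."""
    return ((psi[m] - psi[n])
            + (rate[m] - rate[n]) * dt
            + (delay[m] - delay[n]) * dnu)


def measured_visibility(v_true, psi, rate, delay, m, n, dt, dnu):
    """Model correlator output with unit gain moduli (amplitudes are assumed
    to be calibrated separately)."""
    return v_true * cmath.exp(1j * baseline_phase(psi, rate, delay, m, n, dt, dnu))


def count_unknowns(n_antennas):
    """Phase, rate, and delay per antenna; the reference antenna is fixed at zero."""
    return 3 * (n_antennas - 1)


def count_baselines(n_antennas):
    return n_antennas * (n_antennas - 1) // 2


# Hypothetical four-antenna example; antenna 0 is the reference.
psi = [0.0, 0.3, -1.1, 0.7]            # rad
rate = [0.0, 1.0e-3, -2.0e-3, 5.0e-4]  # rad/s
delay = [0.0, 2.0e-7, -1.0e-7, 3.0e-7]  # rad/Hz

z_12 = measured_visibility(1.0 + 0.0j, psi, rate, delay, 1, 2, dt=10.0, dnu=1.0e6)
print(z_12, count_unknowns(10), count_baselines(10))
```

For ten antennas there are 27 unknowns but 45 baselines, each sampled over many time and frequency points, which is why a simultaneous fit can succeed where a single-baseline search cannot.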

Another method of using the data for several baselines simultaneously in fringe 
fitting is an extension of the method described earlier for single baselines. The 
measured visibility data are required to be specified in terms of fringe frequency 
and delay, which can be obtained, for example, by a time-to-frequency Fourier 
transformation of the data from a lag correlator. Then for each antenna pair, there is 
a matrix of values of the interferometer response at incremental steps in the delay 
and fringe rate. The maximum amplitude indicates the solution for delay and fringe 
rate for the corresponding baseline, as illustrated in Fig. 9.7. However, the method 
can be extended to include the responses from a number of baselines by using the 
closure phase principle, which is discussed in more detail in Sect. 10.3. Because we 
are considering fringe fitting in phase only, the measured data are represented by 
$\tilde{\phi}_{mn}$. Since $\psi_{mn}$, the instrumental phase for baseline $mn$, is equal to the difference
between the measured and true visibility phases, we can write

$$\psi_{mn} = \psi_m - \psi_n = \tilde{\phi}_{mn} - \phi_{mn}, \tag{9.100}$$

where the $\psi$ terms represent the instrumental phases, the $\phi$ terms represent the
visibility phases, and the tilde (~) indicates measured visibility phases. Now
consider including a third antenna, designated $p$. For this combination, we can write

$$\psi_{mpn} = \psi_{mp} + \psi_{pn} = (\psi_m - \psi_p) + (\psi_p - \psi_n) = \psi_m - \psi_n. \tag{9.101}$$

Thus, $\psi_{mpn}$ provides another measured value of $\psi_{mn}$, equal to

$$\psi_{mp} + \psi_{pn} = (\tilde{\phi}_{mp} - \phi_{mp}) + (\tilde{\phi}_{pn} - \phi_{pn}). \tag{9.102}$$

Similarly, for four antennas

$$\psi_{mpqn} = \psi_m - \psi_n = (\tilde{\phi}_{mp} + \tilde{\phi}_{pq} + \tilde{\phi}_{qn}) - (\phi_{mp} + \phi_{pq} + \phi_{qn}). \tag{9.103}$$

Thus, estimated values of $\psi_{mn}$ can be obtained from the measurements from loops
of antenna pairs, starting with antenna m and ending with antenna n. Combinations 
of more than three baselines (four antennas) can be expressed as combinations 
of smaller numbers of antennas, and the noise in such larger combinations is not 
independent. Loops of three and four antennas provide additional information that 
contributes to the sensitivity and accuracy of the fringe fitting for antennas m and n. 
Note, however, that the model visibilities are also required. 
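
The loop relations of Eqs. (9.100) to (9.103) can be sketched as follows. The example is noise free and uses a point-source model, so that all model visibility phases are zero; the antenna phases and the helper names are hypothetical.

```python
import math


def wrap(phase):
    """Wrap a phase to the interval (-pi, pi]."""
    return math.atan2(math.sin(phase), math.cos(phase))


def loop_estimate(measured, model, path):
    """Estimate psi_mn from a chain of baselines path = [m, p, ..., n] by
    summing (measured - model) phases along the chain, as in Eqs. (9.102)
    and (9.103)."""
    total = 0.0
    for a, b in zip(path[:-1], path[1:]):
        total += measured[(a, b)] - model[(a, b)]
    return wrap(total)


# Hypothetical instrumental phases (rad) for four antennas.
psi = {0: 0.0, 1: 0.4, 2: -0.9, 3: 1.3}
pairs = [(a, b) for a in psi for b in psi if a != b]
model = {pair: 0.0 for pair in pairs}                       # point source
measured = {(a, b): model[(a, b)] + psi[a] - psi[b] for (a, b) in pairs}

direct = loop_estimate(measured, model, [1, 2])             # baseline 1-2 itself
via_three = loop_estimate(measured, model, [1, 0, 2])       # three-antenna loop
via_four = loop_estimate(measured, model, [1, 0, 3, 2])     # four-antenna loop
print(direct, via_three, via_four)
```

In the absence of noise all three estimates coincide with $\psi_1 - \psi_2$; with noise, the loop estimates provide additional, partly independent measurements of the same quantity.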

Of the two techniques, the least-mean-squares fitting is better with respect to 
uniform combination of the data, but it requires a good starting estimate if it is to 
converge efficiently. Schwab and Cotton (1983) used the second of the two methods 
to provide a starting point for the full least-mean-squares solution. This procedure 
has subsequently become the basis of standard reduction programs for VLBI data 
(Walker 1989a,b). 

Although global fringe fitting provides sensitivity superior to that of baseline- 
based fitting, in practice, some experience is needed to determine when use of the 
global method is appropriate. If the source under study has complicated structure, 
with large variations in the visibility amplitude, it will probably not be well repre- 
sented by the model visibility required in the global fitting method. In such a case, it 
may be better to start with a smaller number of antennas in the fringe fitting or, if the 
source is sufficiently strong, to consider baselines separately. On the other hand, if 
the source contains a strong unresolved component, it may be adequate to consider 
smaller groups of antennas separately and thus reduce the overall computing load. 


9.4.2 Relative Performance of Fringe Detection Methods 


In the regime in which the phase noise limits the sensitivity, careful investigation of 
detection techniques is warranted. The most important of these have been examined 
by Rogers et al. (1995) to determine their relative performance. We assume in all 
cases that the visibility data from the correlator outputs have been averaged for a 
time equal to the coherence time, $\tau_c$, discussed earlier. We have seen in Eq. (9.92)
that incoherent averaging of $N$ time segments of data reduces the level at which a
signal is detectable by an amount proportional to $N^{-1/4}$. Rogers et al. show that for
a detection threshold for which the probability of a false detection is < 0.01% in a
search of $10^6$ values, the threshold of detection is lower than that without incoherent
averaging (in effect, $N = 1$) by a factor $0.53N^{-1/4}$. This result is accurate only
for large $N$, and they find empirically that for smaller $N$, the detection threshold
decreases in proportion to $N^{-0.36}$; that is, the improvement with increasing $N$ is
greater when $N$ is small. Table 9.2 includes the improvement factor $0.53N^{-1/4}$,
together with other results that are discussed below. The fourth column of Table 9.2 
Table 9.2 Relative thresholds for various detection methods$^a$

Line  Method                                                 Threshold (relative flux density)      Numerical example
1     One baseline, coherent averaging                       1                                      1
2     One baseline, incoherent averaging                     $0.53N^{-1/4}$                         0.14 ($N = 200$)
3     3-baseline triple product                              $(4/N)^{1/6}$                          0.52 ($N = 200$)
4     Array of $n_a$ elements, coherent global search        $(2/n_a)^{1/2}$                        0.45 ($n_a = 10$)
5     Global search with incoherent averaging                $0.53\,[4/(N n_a^2)]^{1/4}$            0.05 ($N = 200$, $n_a = 10$)
6     Incoherent averaging over time segments and baselines  $0.53\,\{2/[N n_a(n_a - 1)]\}^{1/4}$   0.05 ($N = 200$, $n_a = 10$)

From Rogers et al. (1995). © AAS. Used by permission.
$^a$ See text for detection criterion.


gives numerical examples of relative sensitivity for N = 200 time segments and 
$n_a = 10$ antennas. Note that for lines 1-5 of Table 9.2, the criterion for detection is
a probability of error of less than 1% in a search of $10^6$ values of delay and fringe
rate for each of $n_a - 1$ elements of the array, the values for the reference antenna taken
to be zero. For line 6, the search spans only the two dimensions of right ascension 
and declination. 
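
The numerical examples in Table 9.2 can be reproduced with a few lines of code. The sketch below assumes the threshold expressions as read from the table: $0.53N^{-1/4}$ for line 2, $(4/N)^{1/6}$ for line 3, $(2/n_a)^{1/2}$ for line 4, the product of lines 2 and 4 for line 5, and line 2 with $N$ multiplied by the number of baselines for line 6. The function name is hypothetical.

```python
def relative_thresholds(n_segments, n_antennas):
    """Relative detection thresholds keyed by the line number of Table 9.2."""
    n_baselines = n_antennas * (n_antennas - 1) / 2.0
    incoherent = 0.53 * n_segments ** -0.25
    global_search = (2.0 / n_antennas) ** 0.5
    return {
        1: 1.0,                                           # coherent, one baseline
        2: incoherent,                                    # incoherent, one baseline
        3: (4.0 / n_segments) ** (1.0 / 6.0),             # triple product
        4: global_search,                                 # coherent global search
        5: global_search * incoherent,                    # product of lines 4 and 2
        6: 0.53 * (n_segments * n_baselines) ** -0.25,    # segments and baselines
    }


for line, value in relative_thresholds(n_segments=200, n_antennas=10).items():
    print(line, round(value, 3))
```

With $N = 200$ and $n_a = 10$, lines 2, 3, 4, and 6 round to the tabulated 0.14, 0.52, 0.45, and 0.05. The product form used here for line 5 evaluates to about 0.06, slightly above the rounded value quoted in the table.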


9.4.3 Triple Product, or Bispectrum 


Another form of the output of a multielement array that can be considered is the 
triple product, or bispectrum, which is the product of the complex outputs for three 
baselines that form a triangle. The triple product is given by the product of measured 
visibilities 


$$P_3 = |Z_{12}||Z_{23}||Z_{31}|\,e^{j(\phi_{12} + \phi_{23} + \phi_{31})} = |Z_{12}||Z_{23}||Z_{31}|\,e^{j\phi_c}, \tag{9.104}$$

where $\phi_c$ represents the closure phase (Sect. 10.3), which is zero if the source is
unresolved. We assume here that the amplitude of the measured visibility, $Z$, is
calibrated separately, so that the moduli of the gain factors $g_m$ and $g_n$ in Eq. (9.94)
are unity. Each of the measured visibility terms includes noise of power $2\sigma^2$, that is,
the noise power in the output of a complex correlator. For the weak-signal case, the
noise determines the variance of the triple product, which is

$$\langle |P_3|^2 \rangle = \langle |Z_{12}|^2 |Z_{23}|^2 |Z_{31}|^2 \rangle = 8\sigma^6. \tag{9.105}$$

For a point source, the signal is real and equal to $V^3$, and the noise power in the real
part is $\langle (\mathrm{Re}\,P_3)^2 \rangle = \langle |P_3|^2 \rangle/2$, where
Re indicates the real part. The ratio of this triple product signal term to the noise
in the real output of the correlator is $V^3/2\sigma^3$. Rogers et al. (1995) also give an
expression for the signal-to-noise ratio that is not restricted to the weak-signal case,
and Kulkarni (1989) gives a general expression in a detailed analysis of the subject. 

Now consider the incoherent average of N values of the triple product for three 
antennas, each of which represents an average of the correlator output over the 
coherence interval, $\tau_c$. We represent this average of triple products by

$$\bar{P}_3 = \frac{1}{N} \sum |Z_{12}||Z_{23}||Z_{31}|\,e^{j\phi_c}. \tag{9.106}$$

If the signal amplitudes are equal, the expectation of the real part of $\bar{P}_3$ is

$$\langle \mathrm{Re}\,\bar{P}_3 \rangle = V^3, \tag{9.107}$$

and the second moment of $\mathrm{Re}\,\bar{P}_3$ is

$$\langle (\mathrm{Re}\,\bar{P}_3)^2 \rangle = \frac{1}{N} \langle |P_3|^2 \cos^2\phi_c \rangle. \tag{9.108}$$

In the weak-signal case, in which the value of $\langle |P_3|^2 \rangle$ results mainly from noise, the
expectation of the second moment is, from Eq. (9.105), $4\sigma^6/N$. The signal-to-noise
ratio is equal to the expectation of $\bar{P}_3$ divided by the square root of the expectation
of the second moment,

$$\mathcal{R}_{sn} = \frac{V^3\sqrt{N}}{2\sigma^3}, \tag{9.109}$$

from which

$$V = (2\mathcal{R}_{sn})^{1/3}\,\sigma N^{-1/6}. \tag{9.110}$$

Line 3 of Table 9.2 gives the signal strength for a value of $\mathcal{R}_{sn}$ that allows detection
at a level corresponding to the specified error criterion. 
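
Two properties of the triple product are easy to verify numerically: antenna-based phase errors cancel around the triangle, and the weak-signal relation $\mathcal{R}_{sn} = V^3\sqrt{N}/2\sigma^3$ can be inverted for the detectable amplitude. The sketch below assumes noise-free visibilities of a point source for the first check; the antenna phases and function names are hypothetical.

```python
import cmath
import math


def triple_product(z12, z23, z31):
    """Product of the complex correlator outputs around a triangle, Eq. (9.104)."""
    return z12 * z23 * z31


def averaged_snr(v, sigma, n_segments):
    """Weak-signal SNR of the averaged triple product: V^3 sqrt(N) / (2 sigma^3)."""
    return v ** 3 * math.sqrt(n_segments) / (2.0 * sigma ** 3)


def detectable_amplitude(snr, sigma, n_segments):
    """Inverse of averaged_snr: V = (2 R_sn)^(1/3) sigma N^(-1/6)."""
    return (2.0 * snr) ** (1.0 / 3.0) * sigma * n_segments ** (-1.0 / 6.0)


def point_source_visibility(v, psi, m, n):
    """Noise-free measured visibility of a point source with antenna phases psi."""
    return v * cmath.exp(1j * (psi[m] - psi[n]))


psi = [0.2, -1.0, 2.5]      # hypothetical antenna phase errors (rad)
v = 0.5
p3 = triple_product(point_source_visibility(v, psi, 0, 1),
                    point_source_visibility(v, psi, 1, 2),
                    point_source_visibility(v, psi, 2, 0))
print(abs(p3), cmath.phase(p3))                    # V^3 and the closure phase
print(detectable_amplitude(snr=7.0, sigma=1.0, n_segments=200))
```

The closure phase of the product is zero to rounding error, whatever antenna phases are chosen.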


9.4.4 Fringe Searching with a Multielement Array 


With an array of $n_a$ VLBI antennas, the amount of information gathered in a
given time is greater than that with a single antenna pair by a factor $n_a(n_a - 1)/2$.
One might thus expect that the array would offer an increase in sensitivity
$\sim [n_a(n_a - 1)/2]^{1/2}$. However, the larger number of antennas also introduces a
very large increase in the parameter space to be searched. Thus, the probability of
encountering high noise amplitudes within this parameter space is correspondingly
greater. It is therefore necessary to increase the signal level used as a detection
threshold in order to avoid increasing the probability of false detection.

Consider a two-element array in which the number of data points to be searched
in the parameter space (frequency × delay) is $n_d$. If a third antenna is then
introduced, and correlation is measured for all baselines, the number of data points
to be searched becomes $n_d^2$. For $n_a$ antennas, it becomes $n_d^{n_a - 1}$. The probability
distribution of the maximum of $n$ Rayleigh-distributed values of the signal plus
noise, $Z_m$, is given in Eq. (9.71) and for large $n$ has a mean value of $\sigma(2 \ln n)^{1/2}$,
see Eq. (9.72). Thus, for a given probability of occurrence, increasing the number
of points to be searched from $n_d$ to $n_d^{n_a - 1}$ increases the level $Z_m$ from $\sigma(2 \ln n_d)^{1/2}$
to $\sigma[2(n_a - 1)\ln n_d]^{1/2}$; that is, the probability of finding a level $(n_a - 1)^{1/2} Z_m$ in
a search of $n_d^{n_a - 1}$ points is the same as that of finding a level $Z_m$ in a search
of $n_d$ points. By increasing the number of antennas from 2 to $n_a$, the overall rms
uncertainty in the signal level is reduced by a factor $[n_a(n_a - 1)/2]^{1/2}$, but since the
detection threshold has increased by $(n_a - 1)^{1/2}$, the effective gain in sensitivity for
detection of sources is increased by only $(n_a/2)^{1/2}$. Rogers (1991) and Rogers et al.
(1995) consider other factors in deriving this result and show that the sensitivity
increase $(n_a/2)^{1/2}$ should be multiplied by a factor that lies between 0.94 and 1.
This factor is not included in Table 9.2.
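
The scaling argument can be written out as a short calculation. The sketch below uses the large-$n$ mean level $\sigma(2 \ln n)^{1/2}$ quoted above and works with $\ln n$ directly, so that the very large number of search points never has to be formed. The function names are hypothetical.

```python
import math


def expected_peak(sigma, log_n_points):
    """Large-n mean of the maximum noise amplitude, sigma * sqrt(2 ln n)."""
    return sigma * math.sqrt(2.0 * log_n_points)


def threshold_ratio(n_d, n_antennas):
    """Increase in the noise peak when the search grows from n_d points to
    n_d**(n_antennas - 1) points."""
    single = expected_peak(1.0, math.log(n_d))
    array = expected_peak(1.0, (n_antennas - 1) * math.log(n_d))
    return array / single


def net_detection_gain(n_antennas):
    """Noise reduction from all baselines divided by the threshold increase."""
    noise_gain = math.sqrt(n_antennas * (n_antennas - 1) / 2.0)
    return noise_gain / math.sqrt(n_antennas - 1)


print(threshold_ratio(n_d=1.0e6, n_antennas=10))   # (n_a - 1)^(1/2) = 3
print(net_detection_gain(10))                      # (n_a / 2)^(1/2)
```

The threshold ratio is independent of $n_d$, which is why the net gain depends only on the number of antennas.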


9.4.5 Multielement Array with Incoherent Averaging 


In Table 9.2, the last two lines are concerned with incoherent averaging of data 
taken with a multielement array. The method on line 5 involves data that have 
been averaged over the coherence time and subsequently averaged incoherently 
before the application of a global fringe search. The relative threshold value is the 
product of the threshold on line 4 for a multielement global search with that on 
line 2 for incoherent averaging over a single baseline. The method in line 6 involves 
incoherent averaging over both time segments (equal to the coherence time) and 
baselines. The relative threshold is obtained from that in line 2 by increasing the 
number of data from N (the number of time segments per baseline) to N multiplied 
by the number of baselines. 


9.5 Phase Stability and Atomic Frequency Standards 


Precision oscillators have been steadily improved since the 1920s, when the 
invention of the crystal-controlled (quartz) oscillator had immediate application to 
the problem of precise timekeeping. In the early 1950s, cesium-beam clocks allowed 
better timekeeping than could be obtained from astronomical observations. This 
development led to an atomic definition of time that differs from the astronomical 
one, and to the establishment of the definition of the second of time based on a 
particular transition frequency of cesium. 
The mathematical theory of the interpretation of measurements of oscillator
phase was systematized by an IEEE committee (Barnes et al. 1971). This paper 
helped standardize the approach to handling low-frequency divergence in the noise 
of oscillators. The physical theory of noise in oscillators was treated by Edson 
(1960). In this section, we develop relevant aspects of the theory and describe the 
operation of atomic frequency standards, with particular emphasis on the hydrogen 
maser. The theory and analysis of phase fluctuations are discussed in more detail by 
Blair (1974) and Rutman (1978). 


9.5.1 Analysis of Phase Fluctuations 


The desired signal from an oscillator is a pure sine wave:

$$V(t) = V_0 \cos 2\pi\nu_0 t. \tag{9.111}$$

This is unobtainable since all devices have some phase noise. A more realistic model
is given by

$$V(t) = V_0 \cos\left[2\pi\nu_0 t + \phi(t)\right], \tag{9.112}$$

where $\phi(t)$ is a random process characterizing the phase departure from a pure sine
wave. We ignore amplitude fluctuations since they do not directly affect performance
in VLBI applications. The instantaneous frequency $\nu(t)$ is the derivative of
the argument of Eq. (9.112) divided by $2\pi$, that is,

$$\nu(t) = \nu_0 + \delta\nu(t), \tag{9.113}$$

where

$$\delta\nu(t) = \frac{1}{2\pi}\frac{d\phi(t)}{dt}. \tag{9.114}$$

The instantaneous fractional frequency deviation is defined as

$$y(t) = \frac{\delta\nu(t)}{\nu_0} = \frac{1}{2\pi\nu_0}\frac{d\phi(t)}{dt}. \tag{9.115}$$


This definition allows the performance of oscillators at different frequencies to be
compared. We assume that the random processes $\phi(t)$ and $y(t)$ are statistically
stationary, so that correlation functions can be defined. This assumption is not
always valid and can cause difficulty (Rutman 1978). The autocorrelation function
of $y(t)$ is

$$R_y(\tau) = \langle y(t)\,y(t + \tau) \rangle. \tag{9.116}$$

$R_y(\tau)$ is a real and even function, so $S'_y(f)$, the power spectrum of $y(t)$, is a real
and even function of frequency $f$. In order to prevent confusion between $\nu(t)$ and
its frequency components, we use the symbol $f$ for the frequency variable in the
following spectral analysis. Following the somewhat nonstandard convention that is
used in most of the literature on phase stability (Barnes et al. 1971), we replace the
double-sided spectrum $S'_y(f)$ with a single-sided spectrum $S_y(f)$, where $S_y(f) =
2S'_y(f)$ for $f \geq 0$, and $S_y(f) = 0$ for $f < 0$. Since $S'_y(f)$ is even, no information
is lost in this procedure. Thus, the Fourier transform relation $R_y(\tau) \longleftrightarrow S'_y(f)$ can
also be written as

$$\begin{aligned} S_y(f) &= 4\int_0^\infty R_y(\tau)\cos(2\pi f\tau)\,d\tau, \\ R_y(\tau) &= \int_0^\infty S_y(f)\cos(2\pi f\tau)\,df. \end{aligned} \tag{9.117}$$

Similarly, the autocorrelation function of the phase is

$$R_\phi(\tau) = \langle \phi(t)\,\phi(t + \tau) \rangle. \tag{9.118}$$


$S_\phi(f)$, the power spectrum of $\phi$, and $R_\phi(\tau)$ are related by a Fourier transform.
From the derivative property of Fourier transforms, the relationship between $S_y(f)$
and $S_\phi(f)$ can be shown to be

$$S_y(f) = \frac{f^2}{\nu_0^2}\,S_\phi(f). \tag{9.119}$$

$S_y(f)$ and $S_\phi(f)$ serve as primary measures of frequency stability. They both
have the dimensions of Hz$^{-1}$. Another commonly used specification of oscillator
performance is $\mathcal{L}(f)$, which is defined as the power in 1-Hz bandwidth at frequency
$f$ in one sideband of a double-sided spectrum, expressed as a fraction of the total
power of the oscillator. When the phase deviation is small compared with one radian,
$\mathcal{L}(f) = S_\phi(f)/2$.

A second approach to frequency stability is based on time-domain measurements. 
The average fractional frequency deviation is 


$$\bar{y}_k = \frac{1}{\tau}\int_{t_k}^{t_k + \tau} y(t)\,dt, \tag{9.120}$$

Fig. 9.12 (a) Time intervals involved in the measurement of $\bar{y}_k$, as defined in Eq. (9.121). (b) Plot
of a series of phase samples vs. time. The Allan variance, defined in Eq. (9.123), is the average of
the square of the deviation, $(\delta\phi)^2$, of each sample from the mean of its two adjacent samples.

which, from Eq. (9.115), becomes

$$\bar{y}_k = \frac{\phi(t_k + \tau) - \phi(t_k)}{2\pi\nu_0\tau}, \tag{9.121}$$


where the measurements of $\bar{y}_k$ are made with a repetition interval $T$ ($T \geq \tau$) such
that $t_{k+1} = t_k + T$ (see Fig. 9.12a). Measurements of $\bar{y}_k$ are directly obtainable with
conventional frequency counters. The measure of frequency stability is the sample
variance of $\bar{y}_k$, given by

$$\langle \sigma_y^2(N, T, \tau) \rangle = \frac{1}{N - 1}\left\langle \sum_{n=1}^{N}\left( \bar{y}_n - \frac{1}{N}\sum_{k=1}^{N}\bar{y}_k \right)^2 \right\rangle, \tag{9.122}$$

where $N$ is the number of samples in a single estimate of $\sigma_y^2$. In the limit as
$N \to \infty$, the quantity presented above is the true variance, which we represent
as $I^2(\tau)$. However, in many cases, Eq. (9.122) does not converge because of the
low-frequency behavior of $S_y(f)$, and $I^2(\tau)$ is then not defined. To avoid some of
the convergence problems, a particular case of Eq. (9.122), the two-sample or Allan
variance, $\sigma_y^2(\tau)$, has gained wide acceptance (Allan 1966). The Allan variance, for
which $T = \tau$ (no dead time between measurements) and $N = 2$, is defined as
follows:

$$\sigma_y^2(\tau) = \frac{\langle (\bar{y}_{k+1} - \bar{y}_k)^2 \rangle}{2}, \tag{9.123}$$

or, from Eq. (9.121):

$$\sigma_y^2(\tau) = \frac{\langle [\phi(t + 2\tau) - 2\phi(t + \tau) + \phi(t)]^2 \rangle}{8\pi^2\nu_0^2\tau^2}. \tag{9.124}$$


The procedure for estimating the Allan variance can be understood as follows. Take
a series of phase measurements at interval $\tau$, as shown in Fig. 9.12b. For each set of
three independent points, draw a straight line between the outer two and determine
the deviation of the center point from the line. With $m$ samples of $\bar{y}$, the average of
the squared deviations divided by $(2\pi\nu_0\tau)^2$ is an estimate of $\sigma_y^2(\tau)$, denoted $\sigma_{ye}^2(\tau)$,
where

$$\sigma_{ye}^2(\tau) = \frac{1}{2(m - 1)}\sum_{k=1}^{m-1}(\bar{y}_{k+1} - \bar{y}_k)^2. \tag{9.125}$$


The accuracy of this estimate is (Lesage and Audoin 1979) 


$$\frac{\sigma(\sigma_{ye})}{\sigma_y} \simeq \frac{K}{\sqrt{m}}, \tag{9.126}$$


where K is a constant of order unity, whose exact value depends on the power 
spectrum of y. 
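
A minimal sketch of this estimation procedure follows. It converts equally spaced phase samples into average fractional frequencies using Eq. (9.121) and then applies the two-sample estimator of Eq. (9.125). The function names and the simulated white-frequency data are hypothetical; for independent samples of $\bar{y}$ with variance $s^2$, the estimate should approach $s^2$.

```python
import math
import random


def fractional_frequencies(phases, nu0, tau):
    """Eq. (9.121): average fractional frequency between successive phase
    samples taken at interval tau (no dead time)."""
    scale = 2.0 * math.pi * nu0 * tau
    return [(b - a) / scale for a, b in zip(phases[:-1], phases[1:])]


def allan_variance(ybar):
    """Eq. (9.125): two-sample (Allan) variance from m adjacent averages."""
    m = len(ybar)
    if m < 2:
        raise ValueError("need at least two frequency averages")
    diffs = [(b - a) ** 2 for a, b in zip(ybar[:-1], ybar[1:])]
    return sum(diffs) / (2.0 * (m - 1))


# A constant frequency offset gives a linear phase ramp and zero Allan variance.
ramp = [1.0e-3 * k for k in range(100)]
print(allan_variance(fractional_frequencies(ramp, nu0=1.0, tau=1.0)))

# Simulated white-frequency noise: independent averages with unit variance.
rng = random.Random(7)
white = [rng.gauss(0.0, 1.0) for _ in range(20000)]
print(allan_variance(white))
```

The scatter of the second result about unity is of the order of $1/\sqrt{m}$, consistent with the accuracy estimate quoted above.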

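We can now relate the true variance and the Allan variance to the power spectrum
of $y$ or $\phi$. From Eq. (9.121), the true variance is $I^2(\tau) = \langle \bar{y}_k^2 \rangle$, given by

$$I^2(\tau) = \frac{1}{(2\pi\nu_0\tau)^2}\left[ \langle \phi^2(t + \tau) \rangle - 2\langle \phi(t + \tau)\phi(t) \rangle + \langle \phi^2(t) \rangle \right], \tag{9.127}$$

which, from Eq. (9.118), is

$$I^2(\tau) = \frac{1}{2\pi^2\nu_0^2\tau^2}\left[ R_\phi(0) - R_\phi(\tau) \right]. \tag{9.128}$$

Then, since $R_\phi(\tau)$ is the Fourier transform of $S_\phi(f)$, by using Eq. (9.119), we obtain
from Eq. (9.128) the result

$$I^2(\tau) = \int_0^\infty S_y(f)\left( \frac{\sin \pi f\tau}{\pi f\tau} \right)^2 df. \tag{9.129}$$

Similarly, from Eq. (9.124), we obtain

$$\sigma_y^2(\tau) = \frac{1}{(2\pi\nu_0\tau)^2}\left[ 3R_\phi(0) - 4R_\phi(\tau) + R_\phi(2\tau) \right], \tag{9.130}$$

and therefore,

$$\sigma_y^2(\tau) = 2\int_0^\infty S_y(f)\,\frac{\sin^4 \pi f\tau}{(\pi f\tau)^2}\,df. \tag{9.131}$$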


$I^2(\tau)$ and $\sigma_y^2(\tau)$ are dimensionless quantities, measured in rad$^2$, but we can think
of them as the power obtained after filtering $y(t)$ with two different frequency
responses, $H_I^2(f)$ and $H_A^2(f)$, respectively. These are

$$H_I^2(f) = \left( \frac{\sin \pi f\tau}{\pi f\tau} \right)^2 \tag{9.132}$$

and

$$H_A^2(f) = \frac{2\sin^4 \pi f\tau}{(\pi f\tau)^2}. \tag{9.133}$$

The functions $H_I^2(f)$ and $H_A^2(f)$ and the corresponding impulse responses $h_I(t)$ and
$h_A(t)$ are shown in Fig. 9.13. Note that $I^2(\tau)$ can be estimated from a series of
measurements $y_k$ as the average of the square of $h_I(t_k) * y_k$, where the asterisk
indicates convolution. Similarly, $\sigma_y^2(\tau)$ can be estimated as the average of the
square of $h_A(t_k) * y_k$. Other transfer functions could be chosen. In time-domain
measurements, additional filtering with high- and low-frequency cutoffs can be
performed. For example, removing a long-term trend from the frequency data is
a form of highpass filtering. Clearly, measurements of $S_y(f)$ are preferable to those
of $\sigma_y^2(\tau)$, because $\sigma_y^2$ can be calculated from $S_y$ using Eq. (9.131), but $S_y$ cannot be
calculated from $\sigma_y^2$. However, in many cases of interest, as in the power-law spectra
discussed below, the form of $\sigma_y^2$ is indicative of the behavior of $S_y$. Traditionally, it
has been easier to make time-domain measurements, and most published results are
given in terms of the Allan variance $\sigma_y^2$.
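
The filter interpretation can be tested by direct numerical integration. The sketch below assumes the filter forms $H_I^2(f) = (\sin \pi f\tau/\pi f\tau)^2$ and $H_A^2(f) = 2\sin^4 \pi f\tau/(\pi f\tau)^2$ and a hypothetical white-phase spectrum $S_y(f) = h_2 f^2$ truncated at a cutoff $f_h$. The midpoint rule, the function names, and the numerical values are choices made for this illustration.

```python
import math


def h_true_sq(f, tau):
    """Squared response relating S_y to the true variance."""
    x = math.pi * f * tau
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2


def h_allan_sq(f, tau):
    """Squared response relating S_y to the Allan variance."""
    x = math.pi * f * tau
    return 0.0 if x == 0.0 else 2.0 * math.sin(x) ** 4 / x ** 2


def filtered_variance(s_y, h_sq, tau, f_h, n_steps=20000):
    """Midpoint-rule integral of S_y(f) H^2(f) from 0 to the cutoff f_h."""
    df = f_h / n_steps
    total = 0.0
    for i in range(n_steps):
        f = (i + 0.5) * df
        total += s_y(f) * h_sq(f, tau)
    return total * df


H2, TAU, F_H = 1.0e-26, 1.0, 100.0     # hypothetical level, interval, cutoff


def white_phase(f):
    """S_y(f) = h_2 f^2 (white-phase noise)."""
    return H2 * f ** 2


true_var = filtered_variance(white_phase, h_true_sq, TAU, F_H)
allan_var = filtered_variance(white_phase, h_allan_sq, TAU, F_H)
print(true_var, allan_var, allan_var / true_var)
```

With $2\pi f_h\tau \gg 1$, the two integrals come out close to $h_2 f_h/2\pi^2\tau^2$ and $3h_2 f_h/4\pi^2\tau^2$, so their ratio is close to 3/2.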

The effect of local oscillator noise on the measured coherence of signals received 
at two antennas is given by Eq. (7.34) in terms of the rms deviation of the phase 
of the oscillator at one antenna relative to that at the other. For VLBI, this rms 
deviation is equal to the square root of the sum of the true variances of the local 
oscillators at the two antennas. In the case of a connected-element array, low- 
frequency components of the phase noise of the master oscillator cause similar 
effects in the local oscillator phase at each antenna, and therefore their contributions 
to the relative phase at different antennas tend to cancel. For exact cancellation, the 
time delay in the path of the reference signal from the master oscillator to each 

Fig. 9.13 (top) The impulse response $h_I(t)$ and the square of its Fourier transform $|H_I(f)|^2$, given
by Eq. (9.132), which is used to relate the power spectrum $S_y(f)$ to the true variance $I^2(\tau)$,
as defined in Eq. (9.129). (bottom) The impulse response $h_A(t)$ and the square of its Fourier
transform, $|H_A(f)|^2$, given by Eq. (9.133), which is used to relate the power spectrum $S_y(f)$ to
the Allan variance $\sigma_y^2(\tau)$, as defined in Eq. (9.131). Note that the sensitivity of the Allan variance
decreases rapidly with decreasing frequency for $f < 0.3/\tau$.


antenna, plus the time delay of the IF signal from the corresponding mixer to the 
correlator input (including the variable delay that compensates for the geometric 
delay), should be equal for each antenna. It is generally impractical to preserve this 
equality. The bandwidths of phase-locked loops in the local oscillator signals at the 
antennas can also limit the frequency range over which phase noise in the master 
oscillator is canceled. In practice, cancellation of phase noise resulting from the 
master oscillator should generally be effective up to a frequency f in the range of a 
few hundred hertz to a few hundred kilohertz, depending on the parameters of the 
particular system. 

Laboratory measurements show that $S_y(f)$ is often a combination of power-law
components. A useful model, shown in Fig. 9.14, is

$$S_y(f) = \sum_{\alpha=-2}^{2} h_\alpha f^\alpha, \qquad 0 < f < f_h, \tag{9.134}$$

Fig. 9.14 (a) The idealized power spectrum $S_y(f)$ of the fractional frequency deviation $y(t)$ [see
Eq. (9.134)]. The various spectral regimes are marked by Roman numerals, and the power-law
coefficients are given in parentheses. The regimes are I, white-phase noise; II, flicker-phase noise;
III, white-frequency noise; IV, flicker-frequency noise; and V, random-walk-of-frequency noise.
(b) Two-point rms deviation, or Allan standard deviation, vs. the time between samples. The 
spectral regimes are marked by the Roman numerals, and the power-law coefficients are given 
in parentheses. 


where $\alpha$ is a power-law exponent with integer values between $-2$ and 2, and $f_h$ is the
cutoff frequency of a lowpass filter. An equation similar to Eq. (9.134) can be written
for $S_\phi(f)$ using Eq. (9.119). Each term in Eq. (9.134) or the equivalent equation
for $S_\phi(f)$ has a name based on traditional terminology (see Table 9.3). Noise
with a power-law dependence $f^0$, independent of frequency, is called “white-phase
noise”; $f^{-1}$ is called “flicker-phase noise,” or colloquially, “one-over-f noise”; and
$f^{-2}$ is called “random-walk noise.” There are well-known origins for some of
these processes, which we discuss briefly [see also Vessot (1976)]. The frequency
dependence given in parentheses below is for $S_y$.

Table 9.3 Characteristics of noise in oscillators$^a$

Noise type                                $S_y(f)$         $S_\phi(f)$                $\sigma_y^2(\tau)$                                $\mu^b$     $I^2(\tau)$
White phase                               $h_2 f^2$        $\nu_0^2 h_2$              $3h_2 f_h/(4\pi^2\tau^2)$ $^c$                    $-2$        $h_2 f_h/(2\pi^2\tau^2)$
Flicker phase                             $h_1 f$          $\nu_0^2 h_1 f^{-1}$       $[3h_1/(4\pi^2\tau^2)]\ln(2\pi f_h\tau)$ $^c$     $\sim -2$   -
White frequency or random walk of phase   $h_0$            $\nu_0^2 h_0 f^{-2}$       $h_0/(2\tau)$                                     $-1$        $h_0/(2\tau)$
Flicker frequency                         $h_{-1} f^{-1}$  $\nu_0^2 h_{-1} f^{-3}$    $(2\ln 2)\,h_{-1}$                                0           -
Random walk of frequency                  $h_{-2} f^{-2}$  $\nu_0^2 h_{-2} f^{-4}$    $(2\pi^2\tau/3)\,h_{-2}$                          1           -

$^a$ Adapted from Barnes et al. (1971).
$^b$ Power-law exponent of Allan variance: $\sigma_y^2(\tau) \propto \tau^\mu$.
$^c$ For $\sigma_y^2(\tau)$, $2\pi f_h\tau \gg 1$.

• White-phase noise ($f^2$) is usually due to additive noise outside the oscillator, for
example, noise introduced by amplifiers. This process dominates at large values
of $f$, corresponding to short averaging times.

• Flicker-phase noise ($f^1$) is seen in transistors and may be due to diffusion
processes across junctions.

• White-frequency or random-walk-of-phase noise ($f^0$) is due to internal additive
noise within the oscillator, such as the thermal noise inside the resonant cavity.
Shot noise also has this spectral dependence.

• Flicker-frequency noise ($f^{-1}$) and random-walk-of-frequency noise ($f^{-2}$) are
the processes that limit the long-term stability of oscillators. They are due to
random changes in temperature, pressure, and magnetic field in the oscillator
environment. This noise is associated with long-term drift. There is a large
body of literature on flicker-frequency noise, which is encountered in many
situations [see Keshner (1982) for a general discussion, Dutta and Horn (1981)
for applications in solid-state physics, and Press (1978) for applications in
astrophysics].


The variances $I^2(\tau)$ and $\sigma_y^2(\tau)$ can be calculated for the various types of noise
described above. For $\alpha = 1$ and 2, the variances converge only if a high-frequency
cutoff $f_h$ is specified. With this restriction, $\sigma_y^2$ converges for all cases. $I^2(\tau)$ converges
only for $\alpha \geq 0$. These functions are listed in Table 9.3. Except for the logarithmic
dependence in flicker-phase noise, each noise component maps into a component of
Allan variance of the form $\tau^\mu$. From Table 9.3, we can write the total Allan variance

$$\sigma_y^2(\tau) = \left[ K_2^2 + K_1^2 \ln(2\pi f_h\tau) \right]\tau^{-2} + K_0^2\,\tau^{-1} + K_{-1}^2 + K_{-2}^2\,\tau, \tag{9.135}$$
where the $K$ values are constants. The subscripts correspond to the subscripts of
$h$ (see Table 9.3). White-phase and flicker-phase noise both result in $\mu \simeq -2$,
but these two processes can be distinguished by varying $f_h$. Note that for
white-phase and white-frequency noise, the following relations hold [see
Eqs. (9.129) and (9.131)]:

$$\sigma_y^2(\tau) = \frac{3}{2}\,I^2(\tau), \qquad \alpha = 2, \tag{9.136}$$

$$\sigma_y^2(\tau) = I^2(\tau), \qquad \alpha = 0. \tag{9.137}$$

In general, when $I^2(\tau)$ is defined, we see from Eqs. (9.128) and (9.130) that

$$\sigma_y^2(\tau) = 2\left[ I^2(\tau) - I^2(2\tau) \right]. \tag{9.138}$$
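
The power-law model is straightforward to evaluate. The sketch below sums the Allan-variance terms of Table 9.3, as read here, for a set of coefficients $h_\alpha$ and estimates the local exponent $\mu$ from a finite difference in log-log space. The function names, the dictionary layout of the coefficients, and the numerical values are hypothetical.

```python
import math


def allan_variance_model(tau, h, f_h):
    """Total Allan variance from the Table 9.3 terms; h maps the spectral
    exponent alpha (2, 1, 0, -1, -2) to the coefficient h_alpha."""
    pi2 = math.pi ** 2
    white_phase = 3.0 * h.get(2, 0.0) * f_h / (4.0 * pi2 * tau ** 2)
    flicker_phase = (3.0 * h.get(1, 0.0) * math.log(2.0 * math.pi * f_h * tau)
                     / (4.0 * pi2 * tau ** 2))
    white_freq = h.get(0, 0.0) / (2.0 * tau)
    flicker_freq = 2.0 * math.log(2.0) * h.get(-1, 0.0)
    random_walk = 2.0 * pi2 * tau * h.get(-2, 0.0) / 3.0
    return white_phase + flicker_phase + white_freq + flicker_freq + random_walk


def log_slope(tau, h, f_h, step=2.0):
    """Local exponent mu in sigma_y^2 ~ tau^mu from a log-log finite difference."""
    low = allan_variance_model(tau, h, f_h)
    high = allan_variance_model(tau * step, h, f_h)
    return math.log(high / low) / math.log(step)


F_CUT = 1.0e3                                   # hypothetical cutoff in Hz
for alpha in (2, 1, 0, -1, -2):
    print(alpha, round(log_slope(10.0, {alpha: 1.0e-26}, F_CUT), 3))
```

The slopes come out as $-2$, approximately $-1.9$, $-1$, 0, and 1; the flicker-phase value departs from $-2$ only through the logarithmic factor.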


9.5.2 Oscillator Coherence Time 


A quantity of special interest in VLBI is the coherence time. The approximate 
coherence time is that time $\tau_c$ for which the rms phase error is 1 radian:


$$2\pi\nu_0\tau_c\,\sigma_y(\tau_c) \simeq 1. \tag{9.139}$$


Rogers and Moran (1981) calculated a more exact expression for the coherence time 
that they defined in terms of the coherence function 


$$C(T) = \left| \frac{1}{T}\int_0^T e^{j\phi(t)}\,dt \right|, \tag{9.140}$$

where $\phi(t)$ is the component of fringe phase of instrumental origin, and $T$ is an
arbitrary integration time. $\phi(t)$ includes effects that cause the fringe phase to wander,
such as atmospheric irregularities and noise in frequency standards. The rms value
of $C(T)$ is a monotonically decreasing function of time with the range 1 to 0. The
coherence time is defined as the value of $T$ for which $\langle C^2(T) \rangle$ drops to some
specified value, say, 0.5. The mean-squared value of $C$ is

$$\langle C^2(T) \rangle = \frac{1}{T^2}\int_0^T\!\!\int_0^T \left\langle \exp\{ j[\phi(t) - \phi(t')] \} \right\rangle dt\,dt'. \tag{9.141}$$

If $\phi$ is a Gaussian random variable, then

$$\langle C^2(T) \rangle = \frac{1}{T^2}\int_0^T\!\!\int_0^T \exp\left[ -\frac{\sigma^2(t, t')}{2} \right] dt\,dt', \tag{9.142}$$

where $\sigma^2(t, t')$ is the variance $\langle [\phi(t) - \phi(t')]^2 \rangle$, which we assume depends only on
$\tau = t' - t$. Then from Eq. (9.118),

$$\sigma^2(t, t') = \sigma^2(\tau) = \langle [\phi(t) - \phi(t')]^2 \rangle = 2\left[ R_\phi(0) - R_\phi(\tau) \right]. \tag{9.143}$$


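Note that $\sigma^2(\tau)$ is the structure function of phase and is related to $I^2(\tau)$ by
Eq. (9.128):

$$\sigma^2(\tau) = 4\pi^2 \tau^2 \nu_0^2 I^2(\tau) . \tag{9.144}$$

The integral in Eq. (9.142) can be simplified by noting that the integrand is constant
along diagonal lines in $(t, t')$ space for which $t' - t = \tau$. These lines have length
$\sqrt{2}\,(T - \tau)$ so that

$$\langle C^2(T)\rangle = \frac{2}{T} \int_0^T \left(1 - \frac{\tau}{T}\right) \exp\left[-\frac{\sigma^2(\tau)}{2}\right] d\tau . \tag{9.145}$$

Thus, from Eqs. (9.129) and (9.144),

$$\langle C^2(T)\rangle = \frac{2}{T} \int_0^T \left(1 - \frac{\tau}{T}\right) \exp\left[-2\pi^2 \nu_0^2 \tau^2 \int_0^\infty S_y(f) H_I^2(f)\, df\right] d\tau , \tag{9.146}$$

where $H_I^2(f)$ is defined in Eq. (9.132). Since $S_y(f)$ is often not available, it is useful
to relate $\langle C^2(T)\rangle$ to $\sigma_y^2(\tau)$. We can solve Eq. (9.138) for $I^2(\tau)$ by series expansion,
obtaining

$$2I^2(\tau) = \sigma_y^2(\tau) + \sigma_y^2(2\tau) + \sigma_y^2(4\tau) + \sigma_y^2(8\tau) + \cdots , \tag{9.147}$$

provided that the series converges. Therefore, from Eqs. (9.144), (9.145),
and (9.147),

$$\langle C^2(T)\rangle = \frac{2}{T} \int_0^T \left(1 - \frac{\tau}{T}\right) \exp\left\{-\pi^2 \nu_0^2 \tau^2 \left[\sigma_y^2(\tau) + \sigma_y^2(2\tau) + \cdots\right]\right\} d\tau . \tag{9.148}$$

This integral is readily calculable for the cases where $I^2(\tau)$ is defined.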

We now consider white-phase noise and white-frequency noise, which are
important processes in frequency standards on short time scales. For the case of
white-phase noise, $\sigma_y^2 = K_2^2 \tau^{-2}$, where $K_2^2 = 3h_2 f_h/(4\pi^2)$ is the Allan variance in
1 s (Table 9.3), and the coherence function can be evaluated from Eq. (9.146) or
Eq. (9.148):

$$\langle C^2(T)\rangle = \exp\left(-\frac{4\pi^2 \nu_0^2 K_2^2}{3}\right) = \exp(-h_2 f_h \nu_0^2) . \tag{9.149}$$

For white-frequency noise, $\sigma_y^2 = K_0^2 \tau^{-1}$, where $K_0^2 = h_0/2$, and we obtain

$$\langle C^2(T)\rangle = \frac{2(e^{-aT} + aT - 1)}{a^2 T^2} . \tag{9.150}$$

Here, $a = 2\pi^2 \nu_0^2 K_0^2 = \pi^2 h_0 \nu_0^2$. The limiting cases for white-frequency noise are

$$\langle C^2(T)\rangle = \begin{cases} 1 - \dfrac{2\pi^2 \nu_0^2 K_0^2 T}{3} , & 2\pi^2 \nu_0^2 K_0^2 T \ll 1 , \\[1ex] \dfrac{1}{\pi^2 \nu_0^2 K_0^2 T} , & 2\pi^2 \nu_0^2 K_0^2 T \gg 1 . \end{cases} \tag{9.151}$$
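The closed forms in Eqs. (9.149) and (9.150) can be checked against a direct numerical evaluation of Eq. (9.148). The following is a minimal sketch, not a definitive implementation: the function names, the midpoint integration rule, the truncation of the series at 40 terms, and the example values of $\nu_0$ and $K_0$ are all assumptions made for illustration.

```python
import math


def coherence_sq_white_phase(nu0, k2):
    """Eq. (9.149): mean-squared coherence for white-phase noise (no T dependence)."""
    return math.exp(-4.0 * math.pi**2 * nu0**2 * k2**2 / 3.0)


def coherence_sq_white_freq(nu0, k0, t_int):
    """Eq. (9.150): mean-squared coherence for white-frequency noise."""
    x = 2.0 * math.pi**2 * nu0**2 * k0**2 * t_int  # x = a*T
    if x < 1e-4:
        # Leading terms of the Taylor series, used to avoid cancellation.
        return 1.0 - x / 3.0 + x * x / 12.0
    return 2.0 * (math.exp(-x) + x - 1.0) / (x * x)


def coherence_sq_numeric(nu0, allan_var, t_int, n_steps=2000, n_terms=40):
    """Midpoint-rule evaluation of Eq. (9.148) for a callable allan_var(tau)."""
    dtau = t_int / n_steps
    total = 0.0
    for i in range(n_steps):
        tau = (i + 0.5) * dtau
        # Truncated series sigma_y^2(tau) + sigma_y^2(2 tau) + sigma_y^2(4 tau) + ...
        series = sum(allan_var(tau * 2.0**k) for k in range(n_terms))
        weight = 1.0 - tau / t_int
        total += weight * math.exp(-(math.pi * nu0 * tau) ** 2 * series) * dtau
    return 2.0 * total / t_int


if __name__ == "__main__":
    nu0 = 100e9    # observing frequency in Hz (illustrative value)
    k0 = 0.03e-12  # white-frequency coefficient in s^(1/2) (illustrative value)
    for t_int in (10.0, 100.0, 1000.0):
        closed = coherence_sq_white_freq(nu0, k0, t_int)
        numeric = coherence_sq_numeric(nu0, lambda tau: k0**2 / tau, t_int)
        print(t_int, math.sqrt(closed), math.sqrt(numeric))
```

Passing a different `allan_var` callable lets the same integral be evaluated for any Allan variance for which the series in Eq. (9.147) converges.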


The approximate relation for coherence time in Eq. (9.139) corresponds to rms
values of the coherence function of 0.85 and 0.92 for white-phase noise and
white-frequency noise, respectively. These calculations assume that one station has a
perfect frequency standard. In practice, the effective Allan variance is the sum of
the Allan variances of the two oscillators:

$$\sigma_y^2 = \sigma_{y1}^2 + \sigma_{y2}^2 . \tag{9.152}$$

Thus, if two stations have similar standards, the coherence loss is doubled if the
loss is small. If the short-term stability is dominated by white-phase noise, which
is usually the case for hydrogen masers, the coherence function is independent of
time. This means there is a maximum frequency above which a particular standard
will not be usable for VLBI, regardless of the integration time. This frequency is
approximately $1/(2\pi K_2)$ Hz, which for a hydrogen maser is about 1000 GHz.
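The numbers quoted in this paragraph follow directly from Eqs. (9.139), (9.149), and (9.150), and a short calculation makes the steps explicit. This is an illustrative sketch: the function names are hypothetical, and the example value of $K_2$ is the hydrogen-maser entry of Table 9.4 as read here (0.1 in units assumed to be $10^{-12}$ s).

```python
import math


def rms_coherence_white_phase_at_tau_c():
    """sqrt<C^2> from Eq. (9.149) when 2*pi*nu0*K2 = 1, as Eq. (9.139) implies."""
    # With sigma_y = K2/tau, Eq. (9.139) gives 4*pi^2*nu0^2*K2^2 = 1.
    return math.sqrt(math.exp(-1.0 / 3.0))


def rms_coherence_white_freq_at_tau_c():
    """sqrt<C^2> from Eq. (9.150) at T = tau_c, where Eq. (9.139) gives a*T = 1/2."""
    # With sigma_y = K0/sqrt(tau), Eq. (9.139) gives 4*pi^2*nu0^2*K0^2*tau_c = 1.
    x = 0.5
    return math.sqrt(2.0 * (math.exp(-x) + x - 1.0) / (x * x))


def max_usable_frequency(k2):
    """Approximate upper frequency limit 1/(2*pi*K2) in Hz for white-phase noise."""
    return 1.0 / (2.0 * math.pi * k2)


def combined_allan_variance(var1, var2):
    """Eq. (9.152): Allan variances of two independent oscillators add."""
    return var1 + var2


if __name__ == "__main__":
    print(rms_coherence_white_phase_at_tau_c())
    print(rms_coherence_white_freq_at_tau_c())
    print(max_usable_frequency(0.1e-12) / 1e9, "GHz")
```

For two similar standards, `combined_allan_variance` doubles the variance, which doubles the exponent in Eq. (9.149) and hence doubles a small coherence loss.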

In practice, the coherence C(T) is measured at the peak amplitude of the 
correlator output, which varies as a function of fringe frequency. This operation 
is equivalent to removing a constant frequency drift from the phase data and can be 
considered as highpass filtering of the data with a cutoff frequency of $1/T$. Modeling
this operation as the response of a single-pole, highpass filter, one can show that it
ensures the convergence of Eq. (9.148) for all processes for which the Allan variance
exponent $\mu < 1$. To compare the various representations of frequency stability, we
show in Figs. 9.15 and 9.16 examples of the performance of a hydrogen maser given
by the functions $\sigma_y$, $S_y(f)$, and $\langle C^2(T)\rangle^{1/2}$.


9.5.3 Precise Frequency Standards 


Precise frequency standards of interest for VLBI include crystal oscillators and 
atomic frequency standards such as rubidium vapor cells, cesium-beam resonators, 
and hydrogen masers (Lewis 1991). Atomic frequency standards incorporate crystal 
oscillators that are phase-locked or frequency-locked to the atomic process, using 


Fig. 9.15 (a) Power spectrum of the fractional frequency deviation $S_y(f)$ for a hydrogen maser
frequency standard, and (b) the normalized power spectrum of the phase noise voSe (F). Sy(f) and 
$S_\phi(f)$ are related by Eq. (9.119). For frequencies above 10 Hz, $S_y(f)$ approaches the spectrum of
the crystal oscillator to which the maser is locked, which declines as f—*. Adapted from Vessot 
(1979). 


loops with time constants in the range 0.1—1 s, so that short-term performance 
becomes that of the crystal oscillator. Details of how these loops are implemented 
are given by Vanier et al. (1979). The performance of the crystal oscillator is very 
important because unless it has high spectral purity, the phase-locked loops involved 
in generating the local oscillator signal from the frequency standard will not operate 
properly (Vessot 1976). 

We first consider a frequency standard as a “black box” that puts out a stable 
sinusoid at a convenient frequency such as 5 MHz, or some higher frequency, at 
which the crystal oscillator is locked to the atomic process. The performance of 
various devices is shown in Fig. 9.17. These somewhat idealized plots show that the 
Allan variances of the standards have three regions: short-term noise dominated 
by either white-phase or white-frequency noise; flicker-frequency noise, which 
gives the lowest value of Allan variance and is therefore referred to as the “flicker
floor”; and finally, for long periods, random-walk-of-frequency noise. Two other
parameters can be specified, a drift rate and an accuracy. The drift rate is the linear
change in frequency per unit time interval. Note that if the standard drives a clock,
then a constant drift rate results in a clock error that accumulates as time squared.
The accuracy refers to how well the standard can be set to its nominal frequency.
The performance parameters are summarized in Table 9.4.

Fig. 9.16 (a) Allan standard deviation vs. sample time for a hydrogen maser frequency standard.
Data from Vessot (1979). (b) Coherence $\sqrt{\langle C^2(T)\rangle}$, defined by Eq. (9.145), for various radio
frequencies based on two frequency standards with Allan standard deviations given in (a).
(c) Signal-to-noise ratio, normalized to unity at one second, of the measured visibility vs.
integration time for various frequencies. In a VLBI system, the coherence and signal-to-noise ratios
will be further reduced by atmospheric fluctuations.
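The three-region behavior of the Allan variance and the time-squared growth of clock error under a constant drift can both be illustrated numerically. The sketch below evaluates Eq. (9.135) and integrates a linear frequency drift; the function names are hypothetical, and the example coefficients are the hydrogen-maser values of Table 9.4 as read here, with the units assumed to be $10^{-12}$ s, $10^{-12}$ s$^{1/2}$, and $10^{-15}$. The random-walk term is left at zero.

```python
import math


def allan_variance(tau, k2=0.0, k1=0.0, k0=0.0, km1=0.0, km2=0.0, f_h=None):
    """Total Allan variance of Eq. (9.135) at sample time tau (s).

    f_h (Hz) is only needed when the flicker-phase coefficient k1 is nonzero.
    """
    flicker_phase = 0.0
    if k1 != 0.0:
        flicker_phase = k1**2 * math.log(2.0 * math.pi * f_h * tau)
    return (k2**2 + flicker_phase) / tau**2 + k0**2 / tau + km1**2 + km2**2 * tau


def clock_error_from_drift(drift_per_day, elapsed_s):
    """Time error (s) after elapsed_s for a constant fractional-frequency drift.

    Integrating y(t) = D*t gives x(t) = D*t^2/2, i.e., growth as time squared.
    """
    drift_per_s = drift_per_day / 86400.0
    return 0.5 * drift_per_s * elapsed_s**2


if __name__ == "__main__":
    # Assumed coefficients for an active hydrogen maser (see lead-in).
    maser = dict(k2=0.1e-12, k0=0.03e-12, km1=0.4e-15)
    for tau in (1.0, 10.0, 100.0, 1e3, 1e4):
        print(tau, math.sqrt(allan_variance(tau, **maser)))
    print(clock_error_from_drift(1e-15, 86400.0))
```

With these inputs the white-phase term dominates at 1 s and the flicker floor dominates by about $10^4$ s, which is the qualitative shape described for Fig. 9.17.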


Atomic frequency standards are based on the detection of an atomic or molecular
resonance. There are three parts to any frequency standard [e.g., Kartashoff and
Barnes (1972)]: particle preparation, particle confinement, and particle interrogation.
Particle preparation involves enhancing the population difference in the desired
transition. This is necessary for radio transitions in a gas with temperature $T_g$, for
which $h\nu/kT_g \ll 1$, so that the level populations are nearly equal. Preparation
is usually done either by state selection in a beam passing through a magnetic
or electric field, or by optical pumping. Particle confinement makes it possible to
obtain narrow resonance lines from long interaction times, since according to the
Heisenberg (uncertainty) principle, the line width is equal to the reciprocal of the
interaction time. Particles can be confined in beams or storage cells. Storage cells
either contain a buffer gas or have specially coated walls so that particle collisions do
not result in phase changes. Finally, particle interrogation is the process of sensing
the interaction of particles and radiation fields. Frequency standards can be either
active or passive. An example of an active standard is a maser oscillator. Passive
standards require an external radiation field, and transitions are observed by (1)
absorption, (2) re-emission, (3) detection of particles having made the transition,
or (4) indirect detection of a quantity such as a variation in the rate of optical
pumping. To show how some principles are implemented in practice, we give brief
descriptions of the operation of several types of standards in the next two sections.

Fig. 9.17 Idealized performance of various frequency standards and other systems. Rubidium
(1965) = Hewlett-Packard (HP) 5065; cesium (1965) = HP 5061-004; cesium (1984) = NBS
Laboratory device no. 4; hydrogen (1970) = early Varian/HP hydrogen maser oscillator; hydrogen
(1984) = hydrogen maser SAO VLG-11; quartz (2013) = crystal oscillator, Oscilloquartz 8607.
Dots represent performance of the hydrogen maser oscillator by T4 Science, iMaser 3000; CSO
(2011) = cryogenic sapphire oscillator stabilized by GPS (Doeleman et al. 2011). Millisecond
pulsars are very stable clocks, and the data on one of them from Davis et al. (1985) are shown.
The stability of some of them, i.e., those with small amounts of “red noise,” reaches $10^{-15}$ on a
time scale of ten years ($3 \times 10^8$ s) [see Verbiest et al. (2009) and Hobbs et al. (2012)]. VLBI data,
which show the effect of path length stability through the atmosphere in approximately average
conditions at low elevation sites, are from Rogers and Moran (1981).

Table 9.4 Typical performance^a data on available frequency standards^b

Type         $K_2$            $K_0$                     $K_{-1}$       $K_{-2}$                  Drift rate^c   Fractional accuracy
             ($10^{-12}$ s)   ($10^{-12}$ s$^{1/2}$)    ($10^{-15}$)   ($10^{-?}$ s$^{-1/2}$)    ($10^{-15}$)   ($10^{-12}$)
H (active)   0.1              0.03                      0.4            0.1                       <1             1
Cs           —                50                        100            3                         1              5
Cs^d         —                7                         40             3                         1              2
Rb           —                7                         500            300                       $10^2$         $10^2$
Crystal      1                —                         500            300                       $10^3$         —

^a Two-point Allan standard deviation; coefficient defined by Eq. (9.135).
^b Updated from Hellwig (1979).
^c Fractional frequency change per day.
^d High-performance Cs.

Other types of frequency standards are under development. For a general review 
of types of technology, see Drullinger et al. (1996). The cryogenic sapphire 
oscillator has excellent short-term stability (better than that of the hydrogen maser) 
and may be useful for VLBI at frequencies approaching 1 THz (Doeleman et al. 
2011; Rioja et al. 2012). Other laboratory devices include the laser-cooled mercury 
ion frequency standard (Berkeland et al. 1998) and the ultracold atomic ytterbium 
oscillator, whose stability approaches $10^{-18}$ in 7 h (Hinkley et al. 2013).


9.5.4 Rubidium and Cesium Standards 


Rubidium is an alkali metal with a single valence electron and thus a hydrogenlike 
spectrum. The electronic ground state is split into two levels, with a transition 
frequency of 6835 MHz. These levels correspond to the spin of the unpaired electron 
being parallel or antiparallel to the nuclear spin vector. A schematic diagram of the 
oscillator system is shown in Fig. 9.18. An RF plasma discharge in a tube containing 
$^{87}$Rb excites the gas to an electronic level about 0.8 $\mu$m above the ground state.
The light from this discharge passes through a filter that removes the components
involving the $F = 2$ level and passes the light at 0.7948 $\mu$m. This filter consists
of a cell of $^{85}$Rb atoms whose energy levels are slightly shifted from those of the
$^{87}$Rb atoms, such that both gases have transitions near 0.7800 $\mu$m. The filtered
light passes through another cell of $^{87}$Rb gas inside a microwave cavity resonant
at the transition frequency between the $F = 2$ and $F = 1$ levels. With no RF
signal applied to the cavity, the gas is nearly transparent, and the discharge beam is
unattenuated as it reaches the photodetector. The application of an RF signal at 6835
MHz stimulates transitions from the $F = 2$ to $F = 1$ level. The atoms reaching the
lower level are then pumped to the excited state by the light from the filtered $^{87}$Rb
lamp. The $^{87}$Rb light therefore suffers absorption. A buffer gas, consisting of inert
atoms that collide elastically with the $^{87}$Rb atoms in the resonance cell, extends
the interaction time to about $10^{-2}$ s, the mean collision time with the cell walls,
and gives an absorption resonance with a line width of about $10^{2}$ Hz. The cavity
is magnetically shielded to minimize external fields. A weak homogeneous field is
applied so that only $\Delta M_F = 0$ transitions, which have zero first-order Doppler shift,
are obtained. The absorption resonance has a width of $10^2$–$10^3$ Hz. The shot noise
of individual arriving photons leads to white-frequency noise.

Fig. 9.18 (a) Schematic diagram of a rubidium gas-cell frequency standard; (b) pump and
microwave transitions; (c) magnetic sublevels of microwave transition vs. magnetic field;
(d) absorption of $^{87}$Rb light vs. microwave frequency. Adapted from Vessot (1976).

The radio frequency signal is frequency- or phase-modulated so that the resonance
line is continuously scanned. A control voltage is generated by comparing
the modulation signal and the detector signal and is fed back to the slave oscillator 
driving the cavity to correct its frequency to the peak of the resonance. 

Rubidium standards have the advantage of being small, inexpensive, and readily
transportable. They are sometimes used in VLBI below 1 GHz, where the
ionosphere dominates system stability. At higher frequencies, the use of rubidium 
standards results in degraded performance. They are useful as a backup for a primary 
standard and can also be used in OVLBI spacecraft to reduce the uncertainty in the 
timing when the radio link from the ground station is interrupted. 

Cesium, like rubidium, is an alkali metal with a single valence electron. The 
cesium standard is important because it is used to define the standard of atomic time. 
The frequency of the ground-state, spin-flip transition is exactly 9192.631770 MHz, 
by definition of the second of atomic time. A ribbon-shaped beam of cesium gas is 
passed through a state-selector magnet that passes the atoms in the F = 3 level into 
a resonator. Cesium frequency standards are larger and substantially more expensive 
than rubidium standards. Because of their low signal-to-noise ratio, their short-term
stability is poor. Thus, they are not used in VLBI for controlling local oscillators.
However, they provide excellent long-term stability and are used to monitor time. 
They have also been used to verify the capability of transferring time via VLBI 
(Clark et al. 1979). The historical development of the cesium-beam resonator is 
described by Forman (1985). 


9.5.5 Hydrogen Maser Frequency Standard 


The hydrogen maser oscillator is the usual VLBI standard, and we discuss its
operating principles in some detail.$^1$ The quantum mechanical analysis of the hydrogen
maser is presented in a classic paper by Kleppner et al. (1962). Fundamental
principles of masers are given by Shimoda et al. (1956), and details of maser 
construction are given by Kleppner et al. (1965) and Vessot et al. (1976). 

The hydrogen maser oscillator uses the ground-state, spin-flip transition at 
1420.405 MHz, the well-known 21-cm line in radio astronomy. A schematic 
diagram of the oscillator is shown in Fig. 9.19. The hydrogen for the maser comes 
from a tank of molecular hydrogen gas that is dissociated in an RF discharge. The 
gas in the discharge is ionized and emits the reddish glow of the Balmer lines as the 
hydrogen atoms recombine and cascade to the ground state. The atomic gas flows 
out of the dissociator through a hexapole-magnet state selector. The inhomogeneous 
magnetic field separates the two upper states, $F = 1$, $M_F = 1$ and $F = 1$, $M_F = 0$,
from the lower states, $F = 1$, $M_F = -1$ and $F = 0$, $M_F = 0$. The beam of
atoms in the two upper states is directed into the storage bulb that is located inside
a microwave cavity resonant in the TE$_{011}$ or TE$_{111}$ mode at 1420.405 MHz. The
atoms bounce around the inside of the bulb about 10° times before escaping through 
the entrance hole. The spent atoms are evacuated from the system, which operates at 
low pressure, by an ion pump. The cavity is surrounded by several layers of material 
with high magnetic permeability that shield it from ambient magnetic fields. Inside 
the shield is a solenoid that creates a weak homogeneous field. This field allows
the ($F = 1$, $M_F = 0$)-to-($F = 0$, $M_F = 0$) transition to radiate and minimizes
transitions from the $F = 1$, $M_F = 1$ level. There is no first-order Zeeman effect
for the $\Delta M_F = 0$ transition (see Fig. 9.19). The maser will oscillate if the cavity is
tuned close to the transition frequency and the losses are small enough. In the active 
maser, the 1420-MHz signal is picked up by a cavity probe and used to phase-lock 
a crystal oscillator from which a signal at the hydrogen line frequency has been 
synthesized. 


$^1$This section describes the operation of the active hydrogen maser. There is another device called
a passive hydrogen maser frequency standard, in which the hydrogen in the cavity does not reach
self-oscillation. This type of standard has about one order of magnitude poorer performance than
the active hydrogen maser.

Fig. 9.19 (a) Schematic diagram of a hydrogen maser frequency standard. The line frequency
shown is the rest frequency of the transition in free space from Hellwig et al. (1970). The actual
frequency will differ typically by $\sim 0.1$ Hz because of cavity pulling, second-order Doppler, and
the wall shift. (b) Energies of magnetic sublevels vs. magnetic field for the 21-cm transition.
Adapted from Vessot (1976). (c) Curves of resonance frequency $\nu_0$ vs. cavity frequency $\nu_C$ for
two values of line width [see Eq. (9.158)]. The intersection of the curves, which can be found
empirically, gives the best operating frequency.


The interaction lifetime of an atom in the bulb can be described by an exponential
probability function

$$f(t) = \gamma e^{-\gamma t} , \tag{9.153}$$

where $\gamma$ is the total relaxation rate. The line has an approximately Lorentzian profile
with a line width (full width at half-maximum) $\Delta\nu_0$ of $\gamma/\pi$. The most important
contribution to $\gamma$ is the rate at which atoms escape through the entrance hole. This
rate is

$$\gamma_e = \frac{v_0 A_h}{4V} , \tag{9.154}$$

where $v_0 = \sqrt{8kT_g/\pi m}$ is the average particle speed, $T_g$ is the gas temperature, $m$
is the mass of a hydrogen atom, $A_h$ is the area of the entrance hole, and $V$ is the
volume of the bulb. $\gamma_e$ is about 1 s$^{-1}$. The atoms lose coherence after many wall
collisions, and this leads to a loss rate $\gamma_w \simeq 10^{-4}$ s$^{-1}$. Collisions between hydrogen
atoms cause spin-exchange relaxation at a rate $\gamma_{se}$ that is proportional to the gas
density and to $v_0$. The net relaxation rate is approximately the sum of the three most
important terms:

$$\gamma = \gamma_e + \gamma_w + \gamma_{se} = \pi \Delta\nu_0 . \tag{9.155}$$

All three terms are proportional to $v_0$ and thus also to $\sqrt{T_g}$. Note that the random
thermal motions of the atoms do not give rise to a first-order Doppler broadening of
the line, because the interaction between the atoms and the RF field takes place in a
resonant cavity [see Kleppner et al. (1962)].
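To give a feeling for the magnitudes in Eqs. (9.154) and (9.155), the escape rate and line width can be evaluated for a plausible geometry. This is a rough sketch only: the bulb diameter (15 cm), entrance-hole diameter (2 mm), and spin-exchange rate used in the example are hypothetical values chosen for illustration, not figures from the text, and the function names are likewise assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
M_HYDROGEN = 1.6735575e-27   # kg, mass of a hydrogen atom


def mean_speed(t_gas):
    """Average particle speed v0 = sqrt(8 k T_g / (pi m)) in m/s."""
    return math.sqrt(8.0 * K_BOLTZMANN * t_gas / (math.pi * M_HYDROGEN))


def escape_rate(t_gas, hole_area, bulb_volume):
    """Eq. (9.154): rate (1/s) at which atoms leave through the entrance hole."""
    return mean_speed(t_gas) * hole_area / (4.0 * bulb_volume)


def line_width(gamma_e, gamma_w, gamma_se):
    """Eq. (9.155) rearranged: FWHM line width (Hz) = total relaxation rate / pi."""
    return (gamma_e + gamma_w + gamma_se) / math.pi


if __name__ == "__main__":
    t_gas = 300.0                                   # K
    bulb_volume = 4.0 / 3.0 * math.pi * 0.075**3    # m^3, hypothetical 15-cm sphere
    hole_area = math.pi * 1.0e-3**2                 # m^2, hypothetical 2-mm hole
    gamma_e = escape_rate(t_gas, hole_area, bulb_volume)
    # gamma_w from the text; gamma_se = 0.5 1/s is a made-up illustrative value.
    print(mean_speed(t_gas), gamma_e, line_width(gamma_e, 1e-4, 0.5))
```

With these assumed dimensions the escape rate comes out near 1 s$^{-1}$, the order of magnitude quoted above, and doubling the hole area doubles $\gamma_e$.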

The maser oscillator has two resonant frequencies, the line frequency $\nu_L$ and the
electromagnetic cavity resonance frequency $\nu_C$, defined by the cavity’s dimensions.
In classical oscillators, the frequency is the mean of these two, weighted by the
respective $Q$ factors, $Q_L$ for the line and $Q_C$ for the cavity:

$$\nu_0 = \frac{\nu_L Q_L + \nu_C Q_C}{Q_L + Q_C} . \tag{9.156}$$

The $Q$ factor is defined as $\pi$ times the reciprocal of the fractional loss in energy per
cycle of the resonant frequency. Hence, from Eq. (9.153), $Q_L$ is given by [see, e.g.,
Siegman (1971)]

$$Q_L \simeq \frac{\pi \nu_L}{\gamma} = \frac{\nu_L}{\Delta\nu_0} . \tag{9.157}$$

A typical value of $Q_L$ is about $10^9$. The practical value of $Q_C$ for a silver-plated
cavity is about $5 \times 10^4$. Since $Q_L \gg Q_C$, the resonance frequency is approximately

$$\nu_0 \simeq \nu_L + (\nu_C - \nu_L) \frac{Q_C}{Q_L} . \tag{9.158}$$
QL 


Equation (9.158) describes the effect of “cavity pulling” on the resonance frequency. 
Temperature changes cause the size, and thus the resonant frequency, of the 
cavity to change. Hence, a fractional frequency stability of 107! for the maser 
requires a fractional mechanical stability of about 5 x 10~!° for the cavity. The 
cavity dimensions therefore must be stable to about $10^{-8}$ cm. The cavity must be
made from material with a small thermal expansion coefficient or the temperature
must be carefully controlled. Extreme mechanical stability is also required so that
atmospheric pressure changes do not affect the frequency. The TE$_{011}$ cavity is a
cylinder about 27 cm in length and diameter, appreciably larger than the free-space
wavelength because of the loading by the storage bulb. Coarse tuning is
accomplished by moving the end plate of the cavity and fine tuning by a varactor
diode. From Eq. (9.158), it is clear that the maser frequency is most stable when
$\nu_C$ is set to $\nu_L$ so that $\nu_0$ equals $\nu_L$ regardless of the values of $Q_C$ and $Q_L$. This
optimal tuning point of the maser can be found by making a plot of $\nu_0$ vs. $\nu_C$, which
is a straight line with slope $Q_C/Q_L$, according to Eq. (9.158). By varying $Q_L$ (for
example, by varying the gas pressure and thereby changing $\gamma$), a family of straight
lines can be generated that intersect at the desired frequency $\nu_0 = \nu_L = \nu_C$ (see
Fig. 9.19c). Servomechanisms are used in some systems to keep the maser cavity
continuously tuned.
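The tuning procedure just described can be mimicked numerically: simulate $\nu_0$ at two cavity settings for each of two line $Q$ values, fit the two straight lines, and find where they cross. The sketch below is an illustration under stated assumptions, not the procedure used in any particular maser. All frequencies are expressed as offsets in Hz from an arbitrary reference, and the offset of the line frequency (3 Hz), the cavity settings, and the $Q$ values in the example are hypothetical.

```python
def oscillation_frequency(nu_line, nu_cavity, q_line, q_cavity):
    """Eq. (9.156): Q-weighted mean of the line and cavity frequencies."""
    return (nu_line * q_line + nu_cavity * q_cavity) / (q_line + q_cavity)


def pulled_frequency(nu_line, nu_cavity, q_line, q_cavity):
    """Eq. (9.158): approximation valid for q_line >> q_cavity."""
    return nu_line + (nu_cavity - nu_line) * q_cavity / q_line


def fit_line(x1, y1, x2, y2):
    """Slope and intercept of the straight line through two points."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def tuning_point(nu_line, q_cavity, q_line_a, q_line_b, cavity_settings=(-50.0, 50.0)):
    """Intersection of the nu_0 vs. nu_C lines obtained for two values of Q_L."""
    lines = []
    for q_line in (q_line_a, q_line_b):
        x1, x2 = cavity_settings
        # Simulated "measurements" of the maser frequency at two cavity settings.
        y1 = oscillation_frequency(nu_line, x1, q_line, q_cavity)
        y2 = oscillation_frequency(nu_line, x2, q_line, q_cavity)
        lines.append(fit_line(x1, y1, x2, y2))
    (m_a, b_a), (m_b, b_b) = lines
    nu_cavity = (b_b - b_a) / (m_a - m_b)
    return nu_cavity, m_a * nu_cavity + b_a


if __name__ == "__main__":
    # Line frequency sits 3 Hz above the arbitrary reference (hypothetical).
    print(tuning_point(3.0, 5e4, 1e9, 5e8))
    # Pulling for a cavity detuned by 100 Hz with Q_C/Q_L = 5e-5.
    print(pulled_frequency(0.0, 100.0, 1e9, 5e4))
```

Working with frequency offsets rather than absolute values near 1420 MHz keeps the small pulling terms well above floating-point rounding error.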

The performance of hydrogen masers is shown in Figs. 9.16 and 9.17. For periods 
less than 10° s, the performance is limited by two fundamental processes: white- 
frequency noise due to thermal noise generated inside the cavity and white-phase 
noise due to thermal noise in the external amplifier. The thermal noise generated 
inside the cavity produces a fractional frequency variance (Allan variance) of 


i =—. 3 (9.159) 


where Po is the power delivered by the atoms (Edson 1960; Kleppner et al. 1962). 
There is also shot noise in the cavity due to the discrete radiation of photons. 
However, this process, described by the Allan variance On is smaller than Op by 
the ratio $h\nu/kT_g$, which is $2 \times 10^{-4}$ at room temperature. Spontaneous emission also
contributes a small amount of noise, equivalent to increasing $T_g$ by $h\nu/k \simeq 0.07$ K.
Finally, the maser receiver adds a noise power, $kT_R\Delta\nu$, to the signal coupled out
of the cavity, where $T_R$ is the receiver noise temperature and $\Delta\nu$ is the receiver
bandwidth. This noise causes an Allan variance of (Cutler and Searle 1966)


2 1 kTrAv 
Or = SOD 
2 (2m vot)” Po 


(9.160) 


These two processes are independent, so the net Allan variance is o; = Oy + OR: 
The effects of both processes are clearly evident in the data in Fig. 9.17. Note that a 
flicker floor is not reached because of long-term drifts. The short-term performance 
can be improved by increasing the atomic flux level, which increases $P_0$. However,
increasing the flux increases the spin-exchange rate, which decreases $Q_L$, thereby
making the oscillator more susceptible to the long-term effects of cavity pulling.
The frequency of a maser is not exactly equal to the atomic transition frequency
because of several effects. These effects limit the accuracy to which the frequency
can be set, and because most of them are temperature dependent, they probably
contribute to flicker-frequency and random-walk-of-frequency noise. Cavity
pulling, which has been described already, is an important effect, and to minimize
it, the cavity must be tuned carefully. The collision-induced spin-exchange process
gives a frequency shift that varies with $Q_L$ in the same way as the cavity pulling.
Thus, the cavity-tuning procedure also eliminates this shift. Collisions with the
cavity walls produce an effect called the “wall shift,” which is difficult to predict
and may be the ultimate limiting factor in the absolute precision of the maser
frequency (Vessot and Levine 1970). This shift depends on the temperature and
wall coating material. Its fractional value is about $10^{-11}$. The first-order Doppler
effect cancels, but the second-order Doppler effect does not, because of its $v^2/c^2$
dependence [see Kleppner et al. (1962)]. The fractional frequency shift is about
equal to $-1.4 \times 10^{-13}\,T_g$. Finally, there is no first-order Zeeman effect in the ($F = 1$,
$M_F = 0$)-to-($F = 0$, $M_F = 0$) transition. However, the second-order Zeeman
fractional-frequency shift is $2.0 \times 10^{2} B^2$, where $B$ is the magnetic field in tesla.


9.5.6 Local Oscillator Stability 


Local oscillator signals are generated by multiplying a signal from the locked 
oscillator of the frequency standard. The multipliers must have exceptional stability, 
as discussed in Sect. 7.2, to avoid the introduction of additional noise and drift. 
Imperfect multipliers are sensitive to vibration and temperature and may have 
modulation at harmonics of the power line frequency. In an ideal multiplier, a signal 
of the form of Eq. (9.112) is converted to

$$V(t) = \cos[2\pi M \nu_0 t + M\phi(t)] , \tag{9.161}$$

where $M$ is the multiplication factor, $\nu_0$ is the fundamental frequency, and $\phi$ is
the random phase noise of the frequency standard. If the phase noise is small,
$M\phi(t) \ll 1$, then the single-sided power spectrum of $V(t)$ is given by

$$S_V(\nu) = \delta(\nu - M\nu_0) + M^2 S_\phi(\nu - M\nu_0) , \tag{9.162}$$

where $\delta$ is a delta function representing the desired signal, and $S_\phi$ is the power
spectrum of the phase noise. Thus, the noise power increases as the square of the
multiplication factor. In the general case, $S_V$ can be written (Lindsey and Chie 1978)

$$S_V(\nu) = \delta(\nu - M\nu_0) + \sum_{n=1}^{\infty} \frac{M^{2n}}{n!} \left[ S_\phi(\nu - M\nu_0) * S_\phi(\nu - M\nu_0) * \cdots \right] , \tag{9.163}$$

where the term in brackets contains $n$ replications of the same function convolved
together. When only the leading term in the summation is retained, Eq. (9.163)
reduces to Eq. (9.162). The higher-order terms in Eq. (9.163) represent a series of
approximately Gaussian components because of the repeated convolutions. The rms
phase deviation of the multiplier output frequency, $M\nu_0$, is proportional to the rms
voltage of the noise in the output bandwidth, that is, to the square root of the noise
power. Thus, for the case represented by Eq. (9.162), the rms phase fluctuation is
proportional to $M$.
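The $M^2$ scaling of Eq. (9.162) can be seen in a small numerical experiment. As a stand-in for random phase noise, the sketch below uses a single sinusoidal phase modulation of small amplitude, for which the sideband-to-carrier power ratio of the multiplied signal should be close to $(M\epsilon/2)^2$. The modulation model, the sample count, the bin choices, and the function names are all assumptions made for illustration.

```python
import math


def tone_power(samples, k):
    """Power of the DFT component at integer bin k (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return (re * re + im * im) / (n * n)


def multiplied_signal(m, n=4096, k0=128, km=16, eps=1e-3):
    """Samples of Eq. (9.161) with phi = eps*sin(modulation), all tones on DFT bins.

    k0 is the bin of the fundamental, so the multiplied carrier falls on bin m*k0;
    km is the bin of the phase modulation and eps its amplitude in radians.
    """
    return [
        math.cos(2.0 * math.pi * m * k0 * i / n
                 + m * eps * math.sin(2.0 * math.pi * km * i / n))
        for i in range(n)
    ]


def sideband_to_carrier(m, n=4096, k0=128, km=16, eps=1e-3):
    """Ratio of upper-sideband power to carrier power after multiplication by m."""
    v = multiplied_signal(m, n, k0, km, eps)
    return tone_power(v, m * k0 + km) / tone_power(v, m * k0)


if __name__ == "__main__":
    r1 = sideband_to_carrier(1)
    r4 = sideband_to_carrier(4)
    print(r1, r4, r4 / r1)
```

Because the carrier and the modulation both fall on exact DFT bins, there is no spectral leakage, and the ratio of the two results isolates the effect of the multiplication factor.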


9.5.7 Phase Calibration System 


One way to check the integrity of an entire VLBI system is to inject into the front 
end of the receiver an RF signal that is independently derived from the frequency 
standard. The RF test signal can be derived by driving a step-recovery diode with, 
say, a 1-MHz signal from the frequency standard so as to generate a pulse train 
with 1-$\mu$s period. Such a signal has harmonics at 1-MHz intervals throughout the
microwave region, all of which have the same phase at the reference intervals. When 
the RF band is mixed down to baseband, one of the injected harmonics can be 
made to appear at a convenient frequency of order 10 kHz. This is then compared 
with a reference signal from the frequency standard. The phase calibration signal 
can be continuously injected during VLBI recording since a low enough level can 
be used that it can be detected only by very narrowband filtering in the processor 
(~ 10-Hz bandwidth). The calibration allows one to compensate for variations such 
as those caused by thermal effects in cables (Whitney et al. 1976; Thompson and 
Bagri 1991; Thompson 1995). Similar methods are used in some connected-element 
interferometers. 
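The arithmetic behind placing a calibration tone at a convenient baseband frequency is simple enough to sketch. The example assumes a single upper-sideband conversion with one local oscillator, and the LO frequency (8212.99 MHz) and baseband width (2 MHz) are hypothetical values chosen so that one harmonic of the 1-MHz pulse train lands at 10 kHz. The function name is likewise an assumption.

```python
def calibration_tones(lo_hz, bandwidth_hz, spacing_hz=1_000_000):
    """Harmonic numbers and baseband frequencies (Hz) of pulse-train harmonics.

    Assumes upper-sideband conversion: baseband frequency = harmonic - LO.
    All arguments are integers in Hz so that the arithmetic is exact.
    """
    harmonic = -(-lo_hz // spacing_hz)  # ceiling division: first harmonic at or above the LO
    tones = []
    while harmonic * spacing_hz - lo_hz < bandwidth_hz:
        baseband = harmonic * spacing_hz - lo_hz
        if baseband > 0:
            tones.append((harmonic, baseband))
        harmonic += 1
    return tones


if __name__ == "__main__":
    for harmonic, baseband in calibration_tones(8_212_990_000, 2_000_000):
        print(harmonic, baseband)
```

Offsetting the LO by 10 kHz from a multiple of 1 MHz puts every calibration harmonic 10 kHz above a multiple of 1 MHz in baseband, where a narrowband filter can extract it.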


9.5.8 Time Synchronization 


The clocks at VLBI stations must be synchronized accurately enough to avoid 
time-consuming searches for interference fringes. Until around 1980, Loran C 
was widely used to monitor time at VLBI stations. Loran, an acronym for Long 
Range Navigation, is a system originally developed during World War II for ocean 
navigation (Pierce et al. 1948). The transmission frequency is 100 kHz. The relative 
time of arrival of signals from three stations defines the observer’s location on the 
Earth’s surface. For a detailed discussion of Loran C, see Frank (1983). Accuracies 
from a few hundred nanoseconds to a few tens of microseconds are possible, 
depending on the accuracy of the estimate of propagation time. 

The Global Positioning System (GPS) provides higher accuracy than Loran and 
has been used in almost all VLBI systems since the early 1980s. In the GPS system, 
the user receives signals at 1.23 or 1.57 GHz from a number of satellites whose 
positions are known and whose clocks are synchronized to Coordinated Universal 
Time (UTC; see Sect. 12.3). If timing measurements from four satellites are made, 
and corrected for propagation effects in the atmosphere, users can determine their 
positions in three coordinates and their clock errors. The accuracies available to 
civilian users have improved over about a decade from 100 ns in time (Parkinson 
and Gilbert 1983; Lewandowski et al. 1999) to ~ 1 ns (Rose et al. 2014), and further 
improvement down to 100 ps is expected (Ray and Senior 2005). An analysis of the 
time-transfer problem, including relativistic effects, is given by Ashby and Allan 
(1979). For general information on GPS usage, see, for example, Leick (1995).
For time scales of a year, the accuracy of timing from pulsar observations
approaches 1 part in $10^{14}$ (Davis et al. 1985). Ultimately, the best time transfer may
be obtainable from the processed VLBI data (Counselman et al. 1977; Clark et al. 
1979). 
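The statement that timing measurements from four satellites yield three position coordinates and a clock error can be made concrete with a small solver. The sketch below is a simplified illustration and not an operational GPS algorithm: it assumes the pseudorange model $\rho_i = |\mathbf{r} - \mathbf{s}_i| + c\,\Delta t$ with propagation corrections already applied, uses a plain Newton iteration started at the Earth's center, and the function names are hypothetical.

```python
import math

C_LIGHT = 299_792_458.0  # m/s


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def solve_position_and_clock(sat_positions, pseudoranges, iterations=10):
    """Newton iteration for receiver position (m) and clock error (s) from 4 satellites."""
    x, y, z, bias = 0.0, 0.0, 0.0, 0.0  # start at Earth's center; bias is c*dt in meters
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sat_positions, pseudoranges):
            dx, dy, dz = x - sx, y - sy, z - sz
            rng = math.sqrt(dx * dx + dy * dy + dz * dz)
            # Partial derivatives of (range + bias) with respect to x, y, z, bias.
            jac.append([dx / rng, dy / rng, dz / rng, 1.0])
            resid.append(rho - (rng + bias))
        step = solve_linear(jac, resid)
        x, y, z, bias = x + step[0], y + step[1], z + step[2], bias + step[3]
    return (x, y, z), bias / C_LIGHT
```

As a usage example, one can synthesize pseudoranges from a known receiver position and clock offset, pass them to `solve_position_and_clock` together with the satellite coordinates, and compare the result with the values used to generate them.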


9.6 Data Storage Systems 


The basic consideration for any storage system is the representation of the signal and 
the method of incorporating the time information. Recording can be either analog or 
digital, and various data storage technologies are available. Here, we discuss only 
digital recording since the technologies involved are well suited to VLBI and are 
widely used. 

A basic parameter of a recording system is its data rate, $\nu_b$ (bits s$^{-1}$). This
parameter limits the number of bits that can be recorded in a given time and, thus,
also the sensitivity of continuum observations in which the potential IF bandwidth
is larger than $\nu_b/2N_b$, where $N_b$ is the number of bits per sample. The signal is
represented by samples having $Q$ quantization levels taken at $\beta$ times the Nyquist
rate. For $N$ samples, there are $Q^N$ possible data configurations, which require a
minimum of $N \log_2 Q$ bits. Therefore, as noted in Sect. 8.4.3, the maximum RF
bandwidth is

$$\Delta\nu = \frac{\nu_b}{2\beta N_b} = \frac{\nu_b}{2\beta \log_2 Q} . \tag{9.164}$$

The signal-to-noise ratio obtained in time $\tau$ is proportional to $\eta_Q \sqrt{\Delta\nu\,\tau}$, where $\eta_Q$
is the quantization efficiency (see Table 8.3). From Eq. (9.164),

$$\eta_Q \sqrt{\Delta\nu\,\tau} = \frac{\eta_Q}{\sqrt{\beta N_b}} \sqrt{\frac{\nu_b \tau}{2}} . \tag{9.165}$$

If z is the recording time, vpt is equal to the number of recorded bits. The quantity 
no/ BN» thus provides an indication of the performance per bit, which it is 
desirable to maximize. For two- and four-level sampling, the obvious encoding 
schemes are one bit and two bits per sample, respectively. For three-level sampling, 
a problem arises since encoding one sample (one of three possible states) in two data 
bits (representing four possible states) is inefficient. Putting three samples into five 
bits or five samples into eight bits gives data rates of 1.67 and 1.60 bits per sample, 
respectively, compared with the theoretical optimum value of log, 3 = 1.585. The 
values of ng/./ ÊN, for various values of Q and 6, and several encoding schemes, 
are listed in Table 9.5. The highest signal-to-noise ratio is achieved with three-level 
sampling at the Nyquist rate, although two- and four-level sampling give almost the 
same performance.

Table 9.5 Performance of various signal representations as a function of number of quantization
levels, sampling rate, and encoding format^a

Signal representation                          $\eta_Q$    $N_b$     $\eta_Q/\sqrt{\beta N_b}$

Sampling at Nyquist rate ($\beta = 1$)
  Two-level                                    0.637       1.0       0.637
  Three-level   "Ideal" encoding^b             0.810       1.585     0.643
                5 samples/8 bit                0.810       1.60      0.640
                3 samples/5 bit                0.810       1.667     0.627
                1 sample/2 bit                 0.810       2.0       0.573
  Four-level    All products                   0.881       2.0       0.623
                Low-level omitted              0.87        2.0       0.61

Sampling at 2 x Nyquist rate ($\beta = 2$)
  Two-level                                    0.74        1.0       0.52
  Three-level   "Ideal" encoding^b             0.89        1.585     0.50
                5 samples/8 bit                0.89        1.60      0.50
                3 samples/5 bit                0.89        1.667     0.49
                1 sample/2 bit                 0.89        2.0       0.45
  Four-level    All products                   0.94        2.0       0.47

^a $\eta_Q$ = quantization efficiency; $N_b$ = number of bits per sample; $\beta$ = oversampling factor.
^b $N$ samples encoded in $N\log_2 3$ bits.
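
The last column of Table 9.5 follows from the other two through the factor $\eta_Q/\sqrt{\beta N_b}$ of Eq. (9.165). The short Python sketch below recomputes a few rows from the tabulated $\eta_Q$ and $N_b$ values; the function name `performance_per_bit` and the list layout are introduced here purely for illustration.

```python
import math

def performance_per_bit(eta_q, bits_per_sample, beta=1.0):
    """Return eta_Q / sqrt(beta * N_b), the per-bit figure of merit of Eq. (9.165)."""
    return eta_q / math.sqrt(beta * bits_per_sample)

# (label, eta_Q, N_b, beta); eta_Q values are copied from Table 9.5
schemes = [
    ("two-level", 0.637, 1.0, 1.0),
    ("three-level, ideal encoding", 0.810, math.log2(3), 1.0),
    ("three-level, 5 samples/8 bits", 0.810, 8 / 5, 1.0),
    ("three-level, 3 samples/5 bits", 0.810, 5 / 3, 1.0),
    ("three-level, 1 sample/2 bits", 0.810, 2.0, 1.0),
    ("four-level, all products", 0.881, 2.0, 1.0),
    ("two-level, 2 x Nyquist", 0.74, 1.0, 2.0),
]

for label, eta_q, n_b, beta in schemes:
    print(f"{label:32s} {performance_per_bit(eta_q, n_b, beta):.3f}")
```

The printed values agree with the table to the quoted precision, which makes the small spread among the Nyquist-rate schemes easy to see.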


In addition to the encoding schemes discussed above, in which the number of
bits required for a given number of samples is constant, one can also envisage
a scheme in which the number of bits depends on the sample values, that is, a
variable-length code. For example, D'Addario (1984) has suggested encoding the
$+1$, $0$, and $-1$ values in three-level quantization as the binary numbers 11, 0, and
10, respectively. It is possible to decode such a data string uniquely, since all
one-bit representations begin with 0 and all two-bit representations with 1. The average
number of bits per sample depends on the amplitude probability distribution of the
signal waveform and the threshold level settings. For a given number of bits, the
threshold settings that maximize the signal-to-noise ratio are generally not the same
as those derived in Sect. 8.3, which are optimum for a given number of bits per
sample. With D'Addario's encoding scheme, the best performance is achieved with
the threshold set such that $\eta_Q = 0.769$ and $N_b = 1.370$ bits per sample, giving
a performance factor $\eta_Q/\sqrt{\beta N_b}$ equal to 0.657. Thus, an increase in sensitivity of
about 3% compared with the use of the scheme with 1.6 bits per sample could be
achieved. However, the effects of bit errors or interfering signals that change the
amplitude distribution could be more serious. Finally, the data could be encoded
statistically in large blocks that would allow a theoretically optimal value of $N_b$ of
1.317 bits per sample, which, with $\eta_Q$ of 0.769, would give a performance factor of
0.670 (D'Addario 1984).
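
The variable-length code described above can be sketched in a few lines of Python. The function names are hypothetical, and the bit stream is assumed to be well formed. The mean number of bits per sample is computed under the assumption of Gaussian samples with thresholds at $\pm v_0$; the value $v_0 \approx 0.9\sigma$ used in the example is inferred here from $N_b = 1.370$ and is not quoted in the text.

```python
import math

# Code words for three-level samples: +1 -> 11, 0 -> 0, -1 -> 10
CODE = {+1: "11", 0: "0", -1: "10"}

def encode(samples):
    """Concatenate the variable-length code words for a list of +1/0/-1 samples."""
    return "".join(CODE[s] for s in samples)

def decode(bits):
    """Decode a well-formed bit string: '0' is a one-bit word, '1x' a two-bit word."""
    samples = []
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            samples.append(0)
            i += 1
        else:
            samples.append(+1 if bits[i + 1] == "1" else -1)
            i += 2
    return samples

def mean_bits_per_sample(threshold_sigma):
    """Mean code length for unit-variance Gaussian samples (assumed statistics)."""
    p_zero = math.erf(threshold_sigma / math.sqrt(2.0))  # P(|x| < v0)
    return 1.0 * p_zero + 2.0 * (1.0 - p_zero)

samples = [1, 0, -1, 0, 0, 1]
bits = encode(samples)
print(bits, decode(bits) == samples)
print(round(mean_bits_per_sample(0.9), 3))   # close to the quoted 1.370
print(round(0.769 / math.sqrt(1.370), 3))    # performance factor quoted in the text
```

Because zero-valued samples cost only one bit, raising the threshold lowers the mean code length at the expense of quantization efficiency, which is the trade-off behind the quoted optimum.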

In practice, the desirability of a simple encoding scheme and other design 
considerations have usually resulted in the choice of two-level quantization. All five
VLBI systems developed in the United States during the period 1968-1997 (Mark I,
Mark II, Mark III, VLBA, and Mark IV) use two-level sampling, but for the last two 
of these, four-level sampling is also an option. For spectral line observations, where 
the bandwidth of the signal is small with respect to the bandwidth of the recording 
system, multilevel sampling is advantageous. Note that multilevel sampling is a 
more effective way of using recording capacity than sampling faster than the 
Nyquist rate (Table 9.5). 

Each data sample must have either an implicit or explicit time tag. Although an 
error rate of 107° in decoding the data bits is acceptable, a one-bit shift in the time 
axis can be a serious defect and is not acceptable. In virtually all recording systems, 
the data are blocked into records. Each new record begins at a precise time so that 
the temporal registration of the data stream can be recovered if it is lost during the 
previous record. These record lengths are: Mark I, 0.2 s (144,000 bits); Mark II, 
16.7 ms (66,600 bits); and Mark III, 5 ms (20,000 bits). In the Mark I system, which 
used standard computer tape format, the accuracy of recording was very high, and 
the time of any bit was obtained by counting bits from the beginning of the record 
and counting records from the beginning of the tape. In the Mark II system, which 
uses video cassette recorders (VCRs), the data are recorded with a self-clocking 
code, while in the Mark III system, which uses instrumentation recorders, the data 
transitions themselves serve as the clock. The characteristics of several systems are 
given in Fig. 9.20 and Table 9.6. In all of these, the recording is in digital form,
except for the Canadian system used during 1971-83. Wietfeldt and D'Addario
(1991) discuss the compatibility of some of these systems.

Fig. 9.20 Trends in VLBI recording system data rates (circles) and storage costs (squares). (left)
Data rate in Gbits s^{-1} (Gbps) vs. time for various systems. (right) Cost of data storage system in
K$/Gbps. Note that before 2000, data storage was on magnetic tape and after 2000 on disk. From
Whitney et al. (2013), courtesy of and © the Astronomical Society of the Pacific.

A VLBI processor has two main functions: (1) reproduction of smooth data streams
and (2) cross-correlation analysis of the data streams. Before 2000, VLBI data
were stored on magnetic tape. During that period, the data stream from a tape
recorder could have time-base irregularities of up to 100 μs, caused by jitter in
the mechanical playback system, and could be subject to dropouts because of 
tape imperfections. The processor had to derive the true time base either from the 
encoded clock transitions, in the case of a self-clocking code, or from the data 
transitions themselves, when a bit synchronizer was used. There had to be enough 
buffer storage to handle at least the mechanical jitter. The geometric delay was 
corrected with minimal buffer space by shifting the playback time, thereby retaining 
the data on the tape until they were needed by the correlator. If the data were read in 
synchronism from the tapes, a buffer memory of sample capacity about $5 \times 10^4$ times
the clock rate in megahertz would be needed for geometric delay compensation. 
Even today, with disk storage or transmission of data over fiber optic networks, 
some buffer storage is required. 

The major differences between the design of the correlation part of the processor 
for VLBI and for a conventional interferometer are related to the fact that in VLBI, 
fringe rotation and delay compensation are usually performed on the quantized and 
sampled signal. This leads to special problems, which we discuss here. Digitization 
of the signals introduces several signal-to-noise loss factors: $\eta_Q$, the loss factor
associated with amplitude quantization of the recorded signals, discussed in
Sect. 8.3; $\eta_R$, the loss factor incurred by quantizing the phase of the fringe rotation
waveform; $\eta_S$, the loss factor incurred by inadequate sideband rejection as a result
of the limited number of delays in the correlator; and $\eta_D$, the loss caused by
compensating the geometric delay in discrete steps.

Fringe rotation and delay compensation can be done on the analog signals at the
telescope before recording. For example, the fringe rotation can be done at the
telescopes by offsetting the local oscillators, as described in Sect. 6.1.6 for a
connected-element array. The advantage of this arrangement is that only a real
correlation function (with both positive and negative delays) needs to be calculated
(see Sects. 8.8 and 9.1). Hence, only half the correlator circuits are required. Also, the sensitivity
loss from a digital fringe rotator is not incurred. A disadvantage is that the output
of the correlator must be averaged over a short enough interval to accommodate the
residual fringe frequency of a source anywhere in the primary beams of the antennas.
The maximum residual fringe frequency of a source at the half-power point of
the primary beam is $\Delta\nu_f \simeq D\omega_e/d$ [see Eq. (9.11)], where $D$ is the baseline length, $d$
is the antenna diameter, and $\omega_e$ is the angular velocity of the Earth in radians per second.
Hence, the averaging time of the correlator output must be less than $1/(2\Delta\nu_f)$;
for example, it should not exceed 30 ms for a baseline equal to the Earth's diameter
and $d = 25$ m. The correlation functions can be averaged further after they have
been passed through a fringe rotator, which removes the residual fringe frequency.
Also, the unit at the telescope that continually changes the local oscillator frequency
must be carefully designed so that full phase accountability is provided for
astrometric work. Further information on VLBI systems and processing algorithms
can be found in Thomas (1981), Herring (1983), and Deller et al. (2007, 2011).


9.7.1 Fringe Rotation Loss ($\eta_R$)


Fringe rotation is used to reduce to near zero the frequency of the fringe component 
of the correlated signals (see Sect. 6.1.6). Here we consider the fringe frequency 
to include the effect of offsets in the frequency standards. Fringe rotation in the 
processor can be implemented in a number of ways, as shown in Fig. 9.21. If the 
fringe rotator is placed after the correlator (Fig. 9.21a), then the correlation function 
from the correlator must be averaged over an interval short with respect to the fringe 
period. If the local oscillators at the antennas are offset to slow the fringes, so that 
only a little further adjustment is required after the correlator, then this scheme 
is convenient. Otherwise, the short averaging time required and the resulting high 
data rate from the correlator make this arrangement unattractive. Alternatively, before
correlation, one of the data streams can be passed through a digital single-sideband
mixer that shifts the Fourier components of the signal by the appropriate fringe
frequency, as shown in Fig. 9.21b. The 90° phase shift in this mixer is difficult to
implement without introducing spectral distortion, so this type of fringe rotator is
rarely used (see also Sect. 8.7). The fringe rotation scheme shown in Fig. 9.21c is
commonly used, but application of fringe rotation to the quantized signal introduces 
two complications. First, the fringe function with which the signal is multiplied must 
be coarsely quantized so as not to increase the number of bits per sample going to the 
correlator: this also applies to scheme (b). Second, the multiplication introduces an 
unwanted noise sideband, which is described below in Sect. 9.7.2. We now consider 
the first of these effects. 

The data stream is multiplied by a complex function $F$ whose real and imaginary
parts, $F_R$ and $F_I$, approximate $\cos\phi$ and $\sin\phi$, where $\phi$ is the desired phase
function. In the simplest approximation, these functions are square waves with
the appropriate frequency and phases. Thus, as shown in Fig. 9.22, the quantized
signal is multiplied by a fringe rotation function whose amplitude is constant but
whose phase steps by 90° every quarter cycle instead of smoothly progressing.
The resulting visibility function then has a phase component with a 90° sawtooth
modulation at the fringe frequency. This resembles phase noise in which the phase
is uniformly distributed between $\pm 45^\circ$. Therefore, the average signal amplitude is
degraded by $\sin(\pi/4)/(\pi/4) = 0.900$. Another approach to calculating the loss in
signal-to-noise ratio is to calculate the harmonics in the fringe rotation function. The
first harmonic of $F_R$ or $F_I$ has an amplitude of $4/\pi = 1.273$. Only the signal mixed
with the first harmonic appears in the processor output, since the other harmonics
are removed by time averaging. Thus, part of the signal is scattered out of the fringe
passband. The fraction retained is the square root of the ratio of the power in the first
harmonic to the total power of the fringe rotation function, which is $\sqrt{8}/\pi = 0.900$.
This represents the loss in signal-to-noise ratio. There is also a scale-factor change
since the fringe amplitudes are increased by the action of the fringe rotator. Thus,
the fringe amplitudes must be divided by $4/\pi$, the relative amplitude of the first
harmonic of $F_R$.

Fig. 9.21 Various processor configurations showing possible locations of fringe rotator. $F_R$ and
$F_I$ are cosine and sine representations of the fringe function. See text for discussion of relative
merits.


A better fringe rotation function is the three-level approximation of a sine wave
(Clark et al. 1972) shown in Fig. 9.22b. When the fringe rotation function is zero,
the correlator is inhibited. Since the real and imaginary parts of $F$ are never zero
simultaneously, all data bits are used at least once. This fringe rotation function can
be thought of as a phasor whose tip traces out a square such that it has phase jumps in
45° increments and its amplitude alternates between $\sqrt{2}$ and 1. The resulting jitter
in phase is uniformly distributed between $\pm 22.5^\circ$ and results in a loss of signal
amplitude of $\sin(\pi/8)/(\pi/8) = 0.974$. Also, the variation in the amplitude of the
phasor introduces a nonuniform weighting of the signal samples. This reduces the
signal-to-noise ratio by a further factor equal to $(1 + \sqrt{2})/\sqrt{6} = 0.986$. The net loss
in signal-to-noise ratio is 0.960. The reduction in signal-to-noise ratio is also equal
to the square root of the ratio of the power in the first harmonic to the total power
in $F_R$. The first harmonic of $F_R$ is $(4/\pi)\cos(\pi/8) = 1.18$, which is the scale factor
correction for the visibility. The three-level fringe function considered here is used
in many VLBI processors. The fringe period is divided into 16 parts to generate $F$.
The transitions in $F$, which then occur at integral multiples of 1/16 of the fringe
period, are not optimally located, but this approximation results in no more than
0.1% additional loss. Note that an FX correlator can be made to accept input data
with more than one or two bits per sample rather more easily than a lag correlator.
With more data bits per signal sample, more accurate representations of sine and
cosine functions can be used.
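
The harmonic argument above can be checked numerically. The sketch below is an illustration rather than a description of any processor: it samples one fringe period, evaluates the first-harmonic amplitude of $F_R$, and forms the square root of the ratio of first-harmonic power to total power. The three-level function is assumed to be zero for phases between 67.5° and 112.5° (and the mirror interval), consistent with the 45° phase steps described in the text.

```python
import math

def two_level(phi):
    """Square-wave approximation of cos(phi)."""
    return 1.0 if math.cos(phi) >= 0.0 else -1.0

def three_level(phi):
    """Three-level approximation of cos(phi); zero within 22.5 deg of the zero crossings."""
    c = math.cos(phi)
    edge = math.cos(3.0 * math.pi / 8.0)
    if c > edge:
        return 1.0
    if c < -edge:
        return -1.0
    return 0.0

def harmonic_analysis(func, n=1600):
    """Return (first-harmonic amplitude, signal-to-noise factor) for one period."""
    a1 = 0.0
    power = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * (k + 0.5) / n   # midpoint sampling avoids the edges
        f = func(phi)
        a1 += f * math.cos(phi)
        power += f * f
    a1 *= 2.0 / n          # amplitude of the cos(phi) Fourier component
    power /= n             # mean-square value of the function
    return a1, a1 / math.sqrt(2.0 * power)

for name, func in (("two-level", two_level), ("three-level", three_level)):
    amp, loss = harmonic_analysis(func)
    print(f"{name:12s} first harmonic = {amp:.3f}, eta_R = {loss:.3f}")
```

The two cases give first-harmonic amplitudes near 1.273 and 1.176 and loss factors near 0.900 and 0.960, matching the values in the text.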


9.7.2 Fringe Sideband Rejection Loss ($\eta_S$)


The digital fringe rotator shown in Fig. 9.21c is not a single-sideband mixer. Thus, as 
well as the wanted output, shifted in frequency by the fringe frequency, an unwanted 
component of noise corresponding to the image response of a mixer also appears. 
To understand the effect of this noise, consider the cross power spectrum of the
correlator output. Recall that $\nu'$ is the intermediate frequency defined following
Eq. (9.18), and note that in the output of a spectral correlator, $\nu' > 0$ and $\nu' < 0$
refer to the upper and lower sidebands, respectively. For upper-sideband operation,
the cross power spectrum of the signal is given by Eq. (9.26), which is nonzero only
for the upper sideband. However, there will be noise at both positive and negative
frequencies. Thus, the cross power spectrum of the correlator output is

$$S'_{12}(\nu') = \begin{cases} S(\nu')\,e^{j\Phi(\nu')} + n_u(\nu') , & \nu' > 0 , \\ n_\ell(\nu') , & \nu' < 0 , \end{cases} \tag{9.166}$$

where $S(\nu')$ is the instrumental response defined in Eq. (9.19), $j\Phi(\nu')$ is the
exponent in Eq. (9.26), and $n_u$ and $n_\ell$ are the noise spectra for the upper- and
lower-sideband responses. For observations in which a spectral line correlator is used,
$S'_{12}(\nu')$ is computed and the noise at $\nu' < 0$ is simply ignored. For continuum
observations using a correlator with only a small number of channels (lags), the
noise at $\nu' < 0$ contributes excess noise in the correlation function and must be
removed. A straightforward way to remove the noise at $\nu' < 0$ is to compute $S'_{12}(\nu')$
and multiply it by the filtering function


$$H_F(\nu') = \begin{cases} 1 , & 0 < \nu' < \Delta\nu , \\ 0 , & \text{elsewhere} . \end{cases} \tag{9.167}$$

The resulting function, $S'_{12}(\nu')H_F(\nu')$, can be Fourier transformed back into a
correlation function. Alternatively, the filtering can be applied by convolving the
correlation function at the output of the correlator with the Fourier transform of
$H_F(\nu')$, which is

$$h_F(\tau) = \Delta\nu\,\frac{\sin(\pi\Delta\nu\tau)}{\pi\Delta\nu\tau}\,e^{j\pi\Delta\nu\tau} , \tag{9.168}$$

or

$$h_F(\tau) = F_1(\tau) + jF_2(\tau) , \tag{9.169}$$

where $F_1$ and $F_2$ are as defined in Eq. (9.23). The convolution leaves the desired
signal unchanged but removes the negative (lower) sideband noise. Thus, the
resulting correlation function still has the form of Eq. (9.25), plus the positive
(upper) sideband noise that cannot be removed.

The role of $h_F(\tau)$ can be understood in a different way. The correlation function
at the output of the correlator is computed at discrete delays at intervals of $(2\Delta\nu)^{-1}$.
Therefore, the correlation function in Eq. (9.25) has a full width at half-maximum
of about three delay steps. In order to estimate the amplitude and phase of the
correlation function, one would like to do more than just take these values from
the peak of $\rho'_{12}(\tau)$. Rather, one would like to use all the information provided by
the correlation function at various delays. $h_F(\tau)$ is the appropriate interpolation
function that properly weights the correlation function, gathering up the power
at different delays to provide an optimal estimate of the fringe amplitude, phase,
and delay. Note that $h_F(\tau)$ and $\rho'_{12}(\tau)$ are identical forms except for the unknown
amplitude, phase, and delay. These unknown quantities can be estimated by the
usual procedure of matched filtering or, equivalently, least-mean-squares analysis
in which the correlation function is convolved with $h_F(\tau)$. However, $\rho'_{12}(\tau)$ is
measured only over a finite number of delay steps, and some information is lost,
so the signal-to-noise ratio is reduced. Assume that the system lowpass response is
rectangular and the delay errors $\Delta\tau_g$ and $\tau_e$ are zero, so that the correlation function
is centered in the delay range of the correlator. Let $M$ be the number of delay steps
(lags) in the correlator. The loss factor $\eta_S$ is the signal-to-noise ratio when $M$ values
of the correlation function are available, divided by the signal-to-noise ratio when
the entire function is available:


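$$\eta_S = \left[\frac{\sum_{k=-M'}^{M'} |h_F(\tau_k)|^2}{\sum_{k=-\infty}^{\infty} |h_F(\tau_k)|^2}\right]^{1/2} , \tag{9.170}$$

where $\tau_k = k/2\Delta\nu$, $M' = (M - 1)/2$, and $M$ is an odd integer. The denominator in
Eq. (9.170) equals $2\Delta\nu^2$, so

$$\eta_S = \left[\frac{1}{2}\sum_{k=-M'}^{M'}\left(\frac{\sin(\pi k/2)}{\pi k/2}\right)^2\right]^{1/2} . \tag{9.171}$$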

For $M = 1$, $\eta_S = 1/\sqrt{2}$, which corresponds to the case of no image rejection.
$M$ must be at least 3 to ensure that the peak of the correlation function can be
determined; $M \simeq 7$, for which $\eta_S = 0.975$, is adequate for most purposes. For
large $M$, $\eta_S$ approaches unity [see Eq. (A8.5)]. Note that because we assumed the
correlation function was exactly centered, its value will be zero at delay steps 2,
4, 6, 8, and so on. This suggests, for example, that a nine-delay correlator
($M' = 4$) is no better than a seven-delay correlator ($M' = 3$). In practice, the
nine-delay correlator is better because the correlation function is rarely aligned perfectly
in the correlator. In general, $\eta_S$ is slightly smaller than given in Eq. (9.171) if the
correlation function is not perfectly aligned (Herring 1983).
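
As a numerical check of Eq. (9.171), the following Python sketch evaluates $\eta_S$ for a centered correlation function and several odd values of $M$. The function name `sideband_rejection_loss` is introduced here for illustration only.

```python
import math

def sideband_rejection_loss(m_lags):
    """Evaluate Eq. (9.171) for an odd number of lags M (centered correlation function)."""
    if m_lags < 1 or m_lags % 2 == 0:
        raise ValueError("M must be a positive odd integer")
    half = (m_lags - 1) // 2          # M' in the text
    total = 0.0
    for k in range(-half, half + 1):
        x = math.pi * k / 2.0
        sinc = 1.0 if k == 0 else math.sin(x) / x
        total += sinc * sinc
    return math.sqrt(0.5 * total)

for m in (1, 3, 7, 9, 11):
    print(m, round(sideband_rejection_loss(m), 3))
```

The values for $M$ = 1, 3, 7, and 11 reproduce the entries 0.707, 0.952, 0.975, and 0.983 of Table 9.7, and the results for $M$ = 7 and $M$ = 9 coincide because the even-numbered lags contribute nothing when the function is exactly centered.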


9.7.3 Discrete Delay Step Loss ($\eta_D$)


The delay introduced to align the bit streams is quantized at the sampling rate, which
we assume to be the Nyquist rate. Thus, there is a periodic sawtooth delay error with
a peak-to-peak amplitude equal to the sampling period. This effect is also known as
the fractional bit-shift error. The delay error gives rise to a periodic phase shift that
is a function of the baseband frequency, as shown in Fig. 9.23. The phase error has
a peak-to-peak value of

$$\phi_{pp} = \frac{\pi\nu'}{\Delta\nu} , \tag{9.172}$$

and the sawtooth frequency is proportional to the fringe frequency and has a
maximum value of

$$\nu_{ds}(\max) = \frac{2\Delta\nu D\omega_e}{c} \quad \text{(delay steps per second)} , \tag{9.173}$$

where $D$ is the baseline length and $\omega_e$ is the angular velocity of the Earth's rotation
in radians per second. If nothing is done to correct for this effect and the fringe
amplitude is averaged over many times $1/\nu_{ds}$, then the phase at any frequency $\nu'$
is uniformly distributed over $\phi_{pp}$. The amplitude loss as a function of baseband
frequency is

$$L(\nu') = \frac{\sin(\phi_{pp}/2)}{\phi_{pp}/2} , \tag{9.174}$$

and the net signal-to-noise reduction over a baseband response of width $\Delta\nu$ is, using
Eqs. (9.172) and (9.174),

$$\eta_D = \frac{1}{\Delta\nu}\int_0^{\Delta\nu}\frac{\sin(\pi\nu'/2\Delta\nu)}{\pi\nu'/2\Delta\nu}\,d\nu' = 0.873 . \tag{9.175}$$


Unless the fringe amplitude averaging is done over an integral number of fringe
periods, there is also a residual phase error, the amplitude of which decreases with
the number of periods. When the fringe frequency is near zero, this phase error can
be significant.

Fig. 9.23 Discrete delay step effect. Case (a) applies when the fringe rotator corrects the phase for
zero baseband frequency, and case (b) applies when the fringe rotator also inserts a $\pi/2$ phase shift
when the delay changes by one Nyquist sample. The top plots show the phase vs. time at baseband
frequency $\nu'$. The middle plots show the phase across the baseband at three different times denoted
by 1, 2, and 3. The bottom plots show the average amplitude across the baseband.

The effect of the discrete delay step can be compensated for, and no sensitivity
loss need occur. The delay error caused by delay quantization is a known quantity
that introduces a phase slope in the cross power spectrum. Therefore, if the cross
power spectra are calculated on a period short with respect to $1/\nu_{ds}$, which can be
as small as 20 ms on a 5000-km baseline with $\Delta\nu = 20$ MHz [see Eq. (9.173)],
then the effect of the discrete delay step can be removed by adjusting the slope of
the phase of the cross power spectrum. This correction is easily done in spectral
line work where spectra are calculated anyway. Note that if this correction is not
made, the sensitivity loss factor is 0.64 at the high-frequency edge of the band, as
given by Eq. (9.174). In this case, the amplitude response should be compensated
for by dividing the cross power spectra by $L(\nu')$. In continuum work, the correction
is sometimes omitted because of the need to Fourier transform to the frequency
domain and then back to cross-correlation.

A way to compensate partially for the effect of discrete delay steps is to move the
frequency at which the phase is unperturbed from zero to $\Delta\nu/2$, the baseband center.
The phase of the fringe rotator is increased by $\pi\Delta\nu\,\Delta\tau_s$, where $\Delta\tau_s$ is the delay
error. Thus, when the delay changes by one sampling interval, a phase jump of $\pi/2$
is inserted in the fringe rotator. The resulting loss at the band edges is then only 0.90.
The average loss over the band is given by an equation similar to Eq. (9.175), but
with the upper limit of integration changed to $\Delta\nu/2$, and equals 0.966. Also, for
a symmetrical bandpass response, the residual phase error is zero because the net
phase shift over the band at any instant is zero.
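
The numbers quoted in this section can be reproduced with a short Python sketch. It assumes a rectangular baseband of width $\Delta\nu$ and a phase error uniformly distributed over its peak-to-peak range, and it uses simple midpoint integration. The function names, the parameter `ref` (the unperturbed frequency in units of $\Delta\nu$), and the constants `OMEGA_E` and `C` are introduced here for illustration.

```python
import math

OMEGA_E = 7.292e-5        # Earth's angular velocity, rad/s (assumed value)
C = 299_792_458.0         # speed of light, m/s

def max_delay_step_rate(bandwidth_hz, baseline_m):
    """Eq. (9.173): maximum number of delay steps per second."""
    return 2.0 * bandwidth_hz * baseline_m * OMEGA_E / C

def delay_step_loss(frac, ref=0.0):
    """Eq. (9.174) at baseband frequency frac*dnu, with the phase unperturbed at ref*dnu."""
    half_pp = 0.5 * math.pi * abs(frac - ref)    # phi_pp / 2
    return 1.0 if half_pp == 0.0 else math.sin(half_pp) / half_pp

def band_average_loss(ref=0.0, n=2000):
    """Average of the amplitude loss over the band 0..dnu (midpoint rule)."""
    return sum(delay_step_loss((k + 0.5) / n, ref) for k in range(n)) / n

rate = max_delay_step_rate(20e6, 5.0e6)          # 20 MHz, 5000-km baseline
print(f"step period  = {1e3 / rate:.1f} ms")
print(f"no correction: edge {delay_step_loss(1.0):.3f}, mean {band_average_loss(0.0):.3f}")
print(f"band center:   edge {delay_step_loss(1.0, 0.5):.3f}, mean {band_average_loss(0.5):.3f}")
```

With these assumptions the step period is about 20 ms, and the band-averaged factors come out near 0.873 and 0.966, with band-edge values near 0.64 and 0.90.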


9.7.4 Summary of Processing Losses 


The loss factors we have considered are all multiplicative, so the total loss is given
by the equation

$$\eta = \eta_Q\,\eta_R\,\eta_S\,\eta_D , \tag{9.176}$$

where $\eta_Q$ = quantization loss, $\eta_R$ = fringe rotation loss, $\eta_S$ = fringe sideband
rejection loss, and $\eta_D$ = discrete delay step loss.

If there are fringe rotators in each signal path to the correlator, the fringe rotation
loss will be $\eta_R^2$ because the fringe rotator phases will be uncorrelated. A summary
of the loss factors is given in Table 9.7. As an example, a processor might have
two-level sampling ($\eta_Q = 0.637$), three-level fringe rotators in each signal path
($\eta_R = 0.922$), 11-channel correlation function ($\eta_S = 0.983$), and band-center delay
compensation ($\eta_D = 0.966$), giving a net loss of 0.558. Thus, the sensitivity is worse
than that of an ideal analog system with the same bandwidth by a factor of about 2.
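
The worked example can be reproduced with a one-line product. The sketch below uses the single-path three-level value 0.960 squared for the two fringe rotators; the function name is hypothetical.

```python
import math

def total_processing_loss(eta_q, eta_r, eta_s, eta_d):
    """Eq. (9.176): product of the individual signal-to-noise loss factors."""
    return math.prod((eta_q, eta_r, eta_s, eta_d))

eta = total_processing_loss(
    eta_q=0.637,          # two-level sampling
    eta_r=0.960 ** 2,     # three-level fringe rotators in both signal paths
    eta_s=0.983,          # 11-channel correlation function
    eta_d=0.966,          # band-center delay compensation
)
print(f"eta = {eta:.3f}, sensitivity ratio = {1.0 / eta:.2f}")
```

The product is close to the quoted 0.558 (the small difference comes from rounding $\eta_R^2$ to 0.922), and its reciprocal of roughly 1.8 is the "factor of about 2" mentioned above.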

There are other loss factors we have not discussed here. The passband will not in 
reality be perfectly flat, or the response zero, for frequencies above half the Nyquist 
sampling frequency. These imperfections introduce loss, which for an ideal nine- 
pole Butterworth filter amounts to 2% (Rogers 1980). The frequency responses will 
not be perfectly matched for different antennas (see Sect. 7.3). The phase settings of 
the fringe rotator may be calculated exactly at convenient intervals and extrapolated 
by Taylor series; this approximation will introduce periodic phase jumps. The local 
oscillators may have power-line harmonic and noise sidebands that put some fringe 
power outside the usual fringe filter passband. Empirical values of $\eta$ typical of the
first decade of VLBI development were about 0.4 (Cohen 1973).

The $\eta$ values refer to loss in signal-to-noise ratio. The fringe amplitudes must
also be corrected for scale changes due to signal quantization and fringe rotation.
We summarize the multiplicative normalization factors to be applied to the fringe
amplitudes in Table 9.8.

Table 9.7 Signal-to-noise loss factors

Quantization loss ($\eta_Q$)^a
  (a) Two-level                        0.637
  (b) Three-level                      0.810
  (c) Four-level, all products         0.881
Fringe rotation loss ($\eta_R$)
  (a) Two-level, one path              0.900
  (b) Three-level, one path            0.960
  (c) Two-level, both paths            0.810
  (d) Three-level, both paths          0.922
Fringe sideband rejection loss ($\eta_S$)
  (a) 1 channel                        0.707
  (b) 3 channels                       0.952
  (c) 7 channels                       0.975
  (d) 11 channels                      0.983
Discrete delay step loss ($\eta_D$)
  (a) Spectral correction              1.000
  (b) Baseband center correction       0.966
  (c) No correction                    0.873

^a See Sect. 8.3.

Table 9.8 Normalization factors^a

Quantization^b
  (a) Two-level                        1.57
  (b) Three-level                      1.23
  (c) Four-level                       1.13
Fringe rotation
  (a) Two-level, one path              0.786
  (b) Three-level, one path            0.850
  (c) Two-level, both paths            0.617
  (d) Three-level, both paths          0.723

^a Multiply correlator output by listed value to obtain
normalized correlation function.
^b See Sect. 8.3.

For geodetic and astrometric purposes, it is useful to measure the geometric group
delay

$$\tau_g = \frac{1}{2\pi}\frac{\partial\phi}{\partial\nu} \tag{9.177}$$

as accurately as possible. With a single RF band, the delay can be found by fitting a
straight line to the phase vs. frequency of the cross power spectrum. The uncertainty
in this delay, from the usual application of least-mean-squares analysis, is

$$\sigma_\tau = \frac{\sigma_\phi}{2\pi\,\Delta\nu_{\rm rms}} , \tag{9.178}$$

where $\sigma_\phi$ is the rms phase noise for a bandwidth $\Delta\nu$ and $\Delta\nu_{\rm rms}$ is the rms bandwidth,
which for a single band of width $\Delta\nu$ is equal to $\Delta\nu/(2\sqrt{3})$ [see discussion
following Eq. (A12.28) in Appendix 12.1]. $\sigma_\phi$ can be obtained from Eq. (6.64), and
if processing losses are neglected, Eq. (9.178) becomes

$$\sigma_\tau = \frac{T_S}{\zeta\,T_A\sqrt{\Delta\nu^3\tau}} , \tag{9.179}$$

where $\zeta$ is a constant equal to $\pi(768)^{1/4} \simeq 16.5$ [see derivation of Eq. (A12.33)],
and $T_S$ and $T_A$ are the geometric mean system and antenna temperatures. A much
higher value of $\Delta\nu_{\rm rms}$ can be realized by observing at several different radio
frequencies. This can be accomplished by switching the local oscillator of a
single-band system sequentially in time among $N$ frequencies, or by dividing up the
recorded signal into $N$ simultaneous RF bands (channels), which are spread over a
wide frequency interval. The temporal switching method has the disadvantage that
phase changes during the switching cycle degrade or bias the delay estimate. These
methods are commonly referred to as bandwidth synthesis (Rogers 1970, 1976).

In a practical system, signals from a small number of RF bands (about ten) are
recorded. The problem of determining the optimum distribution of these bands in
frequency is similar to the problem of finding a minimum-redundancy distribution
of antenna spacings in a linear array, as discussed in Sect. 5.5. However, here
we do not need to have all multiples of the unit (frequency) spacing up to the
maximum value, and some gaps are not necessarily detrimental. From the spectral
point of view, we wish to have the bands placed in some geometric sequence of
increasing separation so that phase can be extrapolated from one band to the next,
as shown in Fig. 9.24, without having any $2\pi$ ambiguities in the phase connection
process. The rms bandwidth depends critically on the unit spacing, which depends
on the minimum signal-to-noise ratio. The delay accuracy for a multiband system
is obtained from Eq. (9.178) in the same way as for Eq. (9.179) but without the
condition $\Delta\nu_{\rm rms} = \Delta\nu/(2\sqrt{3})$. Thus, we obtain

$$\sigma_\tau = \frac{T_S}{2\sqrt{2}\,\pi\,T_A\sqrt{\Delta\nu\,\tau}\;\Delta\nu_{\rm rms}} , \tag{9.180}$$

where $\Delta\nu_{\rm rms}$ for a typical bandwidth synthesis system is approximately 40% of the
total frequency interval spanned, $\Delta\nu$ is the total bandwidth, and $\tau$ is the integration
time for each band. To avoid explicitly the problem of phase connection, we can
form an equivalent delay function from the cross power spectra [see Eq. (9.26)] of

Fig. 9.24 Fringe phase vs. frequency for a bandwidth synthesis system. The phase is measured
over discrete bands (crosshatched) spaced at multiples of the fundamental band separation
frequency $\nu_s$. The turn ambiguities give rise to sidelobes in the delay resolution function defined
in Eq. (9.181) and shown in Fig. 9.25.


the various bands observed:

$$D_R(\tau) = \sum_{i=1}^{N}\int_{\nu_i}^{\nu_i+\Delta\nu} S_{12}(\nu - \nu_i)\,e^{j2\pi\nu\tau}\,d\nu , \tag{9.181}$$

where the $\nu_i$ are the local oscillator frequencies relative to the lowest one, and $\nu - \nu_i$
is the baseband frequency. The maximum of $|D_R(\tau)|$ gives the maximum-likelihood
estimate of the interferometer delay (Rogers 1970). The a priori normalized delay
resolution function, obtained from Eq. (9.181) by setting $S_{12} = 1$ at frequencies
where it is measured and $S_{12} = 0$ otherwise, is

$$|D_R(\tau)| = \left|\frac{\sin\pi\Delta\nu\tau}{N\pi\Delta\nu\tau}\sum_{i=1}^{N} e^{j2\pi\nu_i\tau}\right| . \tag{9.182}$$

The sinc-function envelope is the delay resolution function for a single channel. The
frequencies $\nu_i$ should be chosen to minimize the width of $D_R(\tau)$ while not allowing
any subsidiary maximum to rise above a level such that it could be confused with
the principal peak. In situations with low signal-to-noise ratio, the minimum unit
spacing should be about four times the bandwidth of a single channel. The delay
resolution function for a five-channel system is shown in Fig. 9.25.

Fig. 9.25 Delay resolution function for a five-channel system with a unit spacing $\nu_s = 4\Delta\nu$ and
spacings of 0, 1, 3, 7, and 15 $\nu_s$, as shown in part in Fig. 9.24. The "grating" lobe at $\tau\Delta\nu = 0.25$
need only be reduced sufficiently below unity to avoid delay ambiguity.
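
The delay resolution function for the five-channel example can be evaluated directly. The Python sketch below assumes equal weighting of $N$ channels of width $\Delta\nu$ and normalizes the peak to unity; frequencies are expressed in units of $\Delta\nu$ and delays in units of $1/\Delta\nu$. It also computes the rms bandwidth of the channel set under the same uniform-weighting assumption. The function names are introduced here for illustration.

```python
import cmath
import math

def delay_resolution(tau, channel_bw, lo_freqs):
    """Normalized |D_R(tau)|: sinc envelope times the mean of the channel phasors."""
    x = math.pi * channel_bw * tau
    envelope = 1.0 if x == 0.0 else math.sin(x) / x
    phasor_sum = sum(cmath.exp(2j * math.pi * f * tau) for f in lo_freqs)
    return abs(envelope * phasor_sum) / len(lo_freqs)

def rms_bandwidth(channel_bw, lo_freqs):
    """Rms spread of frequency about the mean for uniformly weighted channels."""
    centers = [f + 0.5 * channel_bw for f in lo_freqs]
    mean = sum(centers) / len(centers)
    var_centers = sum((c - mean) ** 2 for c in centers) / len(centers)
    return math.sqrt(var_centers + channel_bw ** 2 / 12.0)

bw = 1.0                                            # channel width (unit of frequency)
lo_freqs = [4.0 * m for m in (0, 1, 3, 7, 15)]      # unit spacing of 4 channel widths
span = lo_freqs[-1] + bw                            # total interval spanned

print(round(delay_resolution(0.0, bw, lo_freqs), 3))      # principal peak
print(round(delay_resolution(0.25, bw, lo_freqs), 3))     # grating lobe at tau*dnu = 0.25
print(round(rms_bandwidth(bw, lo_freqs) / span, 3))       # rms bandwidth / span
```

At $\tau\Delta\nu = 0.25$ all channel phasors realign, so the lobe is limited only by the single-channel sinc envelope, about 0.90. For this particular spacing the rms bandwidth is roughly 36% of the spanned interval, of the same order as the 40% quoted for a typical system.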


9.8.1 Burst Mode Observing 


For certain observations, there are advantages in limiting the observing time to short 
bursts, during which the bit rate can be much higher than the mean data acquisition 
rate as limited by the recording technology [see, e.g., Wietfeldt and Frail (1991)]. 
In pulsar observations, the duration of the pulsed emission is typically ~3% of
the total time, so by recording data taken only during pulse-on time, the bandwidth
can be increased by a factor of ~33 over the maximum bandwidth for continuous
observation. This technique requires the use of a high-speed sampler, high-speed
memory, and pulse-timing circuitry at each antenna. During the pulse, the data are
stored in the memory at the high rate and then read out continuously at a lower rate.
If the ratio of these two rates is a factor $w$, then the bandwidth can be increased by the
same factor over constant-rate observing. For pulsars, this results in an increase in
sensitivity by a factor $w$, of which $\sqrt{w}$ can be attributed to the increased bandwidth,
and $\sqrt{w}$ to the fact that noise is not being recorded during the pulse-off time. The
second of these $\sqrt{w}$ factors can be obtained without an increase in the data rate
by simply deleting data during the pulse-off periods. Burst mode observing is also 
useful for astrometry and geodesy because it increases the accuracy of measurement 
of the geometric delay, and it has been used for this purpose in observations of 
continuum sources at millimeter wavelengths. 
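
The pulsar arithmetic above can be checked with a few lines of Python. This is a minimal sketch, assuming that the buffer memory is read out continuously so that the rate ratio w equals the inverse of the pulse duty cycle; the function name `burst_mode_gain` is illustrative and not taken from the text.

```python
import math

def burst_mode_gain(duty_cycle):
    """Return (w, sensitivity_gain) for a given pulse duty cycle.

    Assumption: data are recorded only during the pulse and read out
    continuously, so the rate ratio is w = 1 / duty_cycle.
    """
    w = 1.0 / duty_cycle
    bandwidth_gain = math.sqrt(w)  # from the w-fold wider recorded bandwidth
    gating_gain = math.sqrt(w)     # from not recording noise during pulse-off time
    return w, bandwidth_gain * gating_gain

# Duty cycle of about 3%, as quoted for typical pulsars.
w, sensitivity = burst_mode_gain(0.03)
print(f"bandwidth factor w = {w:.1f}, sensitivity factor = {sensitivity:.1f}")
```

Only the first square-root factor requires the higher burst rate; the second is available in any system that discards the pulse-off data.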


9.9 Phased Arrays as VLBI Elements


A phased array is a series of antennas for which the received signals are combined, 
as indicated in Fig. 5.4. Such systems could be used to form multiple beams, but we 
describe only the case of forming a single beam. The phase and delay of the signal 
from each antenna can be adjusted so that the signals from a particular direction 
in the sky combine in phase, thereby maximizing the sensitivity. It is important 
to consider the use of phased arrays as VLBI elements for two reasons. First, the 
elements of a connected-element synthesis array can be combined to form a phased 
array, thus improving the signal-to-noise ratio of a very-long-baseline interferometer 
in which they participate as a single station. Second, if elements with very large 
collecting area are desired to achieve a high signal-to-noise ratio on each baseline, it 
may be advantageous to build phased arrays rather than monolithic antennas because 
the cost of a parabolic reflector antenna increases approximately as the diameter to 
the power 2.7 (Meinel 1979). 
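
As a rough illustration of the cost argument, compare one antenna of diameter $D$ with $N$ antennas of diameter $d$ that have the same total collecting area. This is a sketch under assumptions that go beyond the text: the cost depends only on reflector diameter, the 2.7 power law holds at both sizes, and the costs of receivers and signal combination are ignored.

```latex
N d^2 = D^2 \;\Rightarrow\; d = \frac{D}{\sqrt{N}}, \qquad
\frac{C_{\rm array}}{C_{\rm single}} = \frac{N d^{2.7}}{D^{2.7}} = N^{1-1.35} = N^{-0.35}.
```

For $N = 10$ the ratio is about 0.45, so under these assumptions the reflector cost is roughly halved.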

Synthesis arrays such as the Westerbork Array, the VLA, the SMA, the Plateau 
de Bure interferometer, and ALMA are also used as phased arrays to provide a large 
collecting area for one element in a VLBI system or other applications. Phasing the 
array consists of adjusting the phase and delay of the signal from each antenna so 
as to compensate for the different geometric paths for a wavefront from the desired 
direction. These corrections are easily made through the delay and fringe rotation 
systems that are used for synthesis imaging. The signals are then summed and go to 
a VLBI recorder. 

We can analyze the performance of a phased array that is used to simulate a
single large antenna. Consider an array of $n_a$ identical antennas for which the system
temperature is $T_S$ and the antenna temperature for a given source that is unresolved
by the longest spacings in the array is $T_A$. The output of the summing port is


$$V_{\rm sum} = \sum_i (s_i + \epsilon_i) , \tag{9.183}$$


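where $s_i$ and $\epsilon_i$ represent the random signal and random noise voltages, respectively,
from antenna $i$. Now $\langle s_i \rangle = \langle \epsilon_i \rangle = 0$ and, omitting constant gain factors, we
can write $\langle s_i^2 \rangle = T_A$ and $\langle \epsilon_i^2 \rangle = T_S$. The power level of the combined signals is
represented as the average squared value of Eq. (9.183),


$$\langle V_{\rm sum}^2 \rangle = \sum_{i,j} \left[ \langle s_i s_j \rangle + \langle s_i \epsilon_j \rangle + \langle s_j \epsilon_i \rangle + \langle \epsilon_i \epsilon_j \rangle \right] . \tag{9.184}$$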


If the array is accurately phased, $s_i = s_j$. Also, since we are considering an
unresolved source, $\langle s_i s_j \rangle = T_A$. If the array is unphased, that is, if the signal phases
at the combination point are random, then $\langle s_i s_j \rangle = T_A$ only for $i = j$ and is otherwise
zero. In either case, $\langle s_i \epsilon_j \rangle = 0$, and $\langle \epsilon_i \epsilon_j \rangle = 0$ for $i \neq j$. Thus, Eq. (9.184) can be
reduced to


$$\langle V_{\rm sum}^2 \rangle = n_a^2 T_A + n_a T_S \quad \text{(array phased)} \tag{9.185}$$
$$\langle V_{\rm sum}^2 \rangle = n_a T_A + n_a T_S \quad \text{(array unphased)} , \tag{9.186}$$


where the first term on the right side represents the signal and the second term 
represents the noise. When the array is phased, the signal-to-noise (power) ratio is 
$n_a T_A/T_S$, and when it is unphased, it is $T_A/T_S$. Thus, the collecting area of the phased
array is equal to the sum of the collecting areas of the individual antennas, but when 
it is unphased, it is, on average, equal to that of a single antenna. 
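
The reduction of Eq. (9.184) to Eqs. (9.185) and (9.186) can be verified by summing the expectation values pair by pair. The following is a minimal sketch; the function name `summed_power` and the numerical values are illustrative assumptions, not part of the original analysis.

```python
def summed_power(n_a, t_a, t_s, phased):
    """Evaluate the double sum of Eq. (9.184); return (signal, noise) power."""
    signal = 0.0
    noise = 0.0
    for i in range(n_a):
        for j in range(n_a):
            # <s_i s_j> = T_A for every pair when phased, only for i == j otherwise.
            if phased or i == j:
                signal += t_a
            # <e_i e_j> = T_S only for i == j (independent receiver noise).
            if i == j:
                noise += t_s
            # The cross terms <s_i e_j> and <s_j e_i> are zero and add nothing.
    return signal, noise

# Hypothetical array: 8 antennas, T_A = 0.5 K, T_S = 50 K.
sig_p, noise_p = summed_power(8, 0.5, 50.0, phased=True)
sig_u, noise_u = summed_power(8, 0.5, 50.0, phased=False)
improvement = (sig_p / noise_p) / (sig_u / noise_u)
print(f"phased SNR / unphased SNR = {improvement:.1f}")
```

The ratio of the two power signal-to-noise ratios equals the number of antennas, as stated above.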

A question of interest concerns the case in which the antennas have different
sensitivities resulting from different effective collecting areas and/or system
temperatures. This is a matter of practical importance even for nominally uniform arrays,
since maintenance or upgrading programs can result in differences in sensitivity.
Consider a phased array in which the individual system temperatures and antenna
temperatures are represented by $T_{Si}$ and $T_{Ai}$, respectively. Here, $T_{Ai}$ is defined as
the signal from a point source of unit flux density,² so $T_{Ai}$ is a characteristic of the
antenna alone and is proportional to the collecting area. We consider only the
weak-signal case for which $T_A \ll T_S$. For antenna $i$, the output voltage from a source of
flux density $S$ is $V_i = s_i + \epsilon_i$, and we can write $\langle s_i^2 \rangle = S T_{Ai}$ and $\langle \epsilon_i^2 \rangle = T_{Si}$.

² Since it is only the relative values of the weighting factors that matter, $T_{Ai}$ could be
defined with respect to any source that is common to all antennas, but consideration of
unit flux density simplifies the explanation.

It is convenient to think of the output of each antenna as providing a measure
of the flux density of the source, which is equal to $V_i^2/T_{Ai}$. The expectation of
the measured value of $S$ should be the same for each antenna. The corresponding
voltages are $\sqrt{S} = V_i/\sqrt{T_{Ai}}$ for the signal and $\epsilon_i/\sqrt{T_{Ai}}$ for the noise. In the
cross-correlation of the array output with another VLBI antenna, the signal-to-noise ratio
at the correlator output is proportional to the signal-to-noise voltage ratio of the
signal from the array. Thus, in combining the signal voltages in the array, we are,
in effect, interested in maximizing the signal-to-noise ratio in an estimate of $\sqrt{S}$.
Because the array antennas are not identical, we should use weighting factors $w_i$ in
combining their signals. The weights should be chosen to maximize the
signal-to-noise ratio of the combined array signals which, in voltage, is


$$R_{\rm sn} = \frac{\sum_i w_i V_i/\sqrt{T_{Ai}}}{\sqrt{\sum_i w_i^2\, T_{Si}/T_{Ai}}} . \tag{9.187}$$


Note that we add the signal voltages and the squares of the rms noise voltages.
Selecting the weights to provide the best signal-to-noise ratio for $V_i/\sqrt{T_{Ai}}$ is
mathematically equivalent to the general problem of obtaining the best estimate of
a measured quantity from a series of measurements for which the rms error levels
are different but are known. The optimum procedure is to take a mean in which the
weight of each measurement is inversely proportional to the variance of the error of
that measurement [see Eq. (A12.6)]. The variance of $V_i$ is proportional to $T_{Si}$, and
thus the variance of $V_i/\sqrt{T_{Ai}}$ is $T_{Si}/T_{Ai}$. Thus, we insert $w_i = T_{Ai}/T_{Si}$ in Eq. (9.187)
and obtain


$$R_{\rm sn} = \frac{\sum_i V_i \sqrt{T_{Ai}}/T_{Si}}{\sqrt{\sum_i T_{Ai}/T_{Si}}} . \tag{9.188}$$


Note that in the numerator, $V_i$ is multiplied by $\sqrt{T_{Ai}}/T_{Si}$, which is therefore the
(voltage) weighting factor for optimum sensitivity in the signal combination. This
conclusion is in agreement with an analysis by Dewey (1994). (Note that the
weighting factors for the signal voltages at the combination point are not $w_i$ but
$w_i/\sqrt{T_{Ai}}$.) The corresponding weighting of the signal power at the combination
point is proportional to $T_{Ai}/T_{Si}^2$.

In synthesis arrays such as the VLA, the IF signals from the antennas are each 
delivered to a digital sampler at the same power level (of signal plus noise), and 
the signals are combined after that point so that the time delays required can be 
inserted digitally. Thus, to avoid modifying the receiving system (which is designed 
for synthesis imaging), the signals are combined with equal powers when the array 
is used in the phased mode. For the case of $T_A \ll T_S$ that we are considering, the
corresponding weighting is $w_i = 1/\sqrt{T_{Si}}$, and the signal-to-noise ratio becomes


$$R_{\rm sn} = \frac{\sum_i V_i/\sqrt{T_{Si} T_{Ai}}}{\sqrt{\sum_i 1/T_{Ai}}} . \tag{9.189}$$


Equal-power weighting usually provides sensitivity within a few percent of opti- 
mum weighting. 
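
The comparison between the two weighting schemes can be illustrated numerically. The sketch below evaluates the weighted signal-to-noise ratio of Eq. (9.187) for a hypothetical four-antenna array with equal collecting areas and unequal system temperatures; all numerical values and the function name `snr_voltage` are assumptions chosen for illustration.

```python
import math

def snr_voltage(weights, v, t_a, t_s):
    """Voltage signal-to-noise ratio of the weighted sum, as in Eq. (9.187)."""
    num = sum(w * vi / math.sqrt(ta) for w, vi, ta in zip(weights, v, t_a))
    den = math.sqrt(sum(w * w * ts / ta for w, ta, ts in zip(weights, t_a, t_s)))
    return num / den

# Hypothetical array (illustrative values only).
t_a = [1.0, 1.0, 1.0, 1.0]        # antenna temperature per unit flux density, K
t_s = [30.0, 35.0, 40.0, 33.0]    # system temperatures, K
flux = 1.0
v = [math.sqrt(flux * ta) for ta in t_a]  # signal voltages, <s_i^2> = S * T_Ai

w_opt = [ta / ts for ta, ts in zip(t_a, t_s)]  # optimum weights, w_i = T_Ai / T_Si
w_eq = [1.0 / math.sqrt(ts) for ts in t_s]     # equal-power weights, w_i = 1 / sqrt(T_Si)

r_opt = snr_voltage(w_opt, v, t_a, t_s)
r_eq = snr_voltage(w_eq, v, t_a, t_s)
ratio = r_eq / r_opt
print(f"equal-power / optimum = {ratio:.4f}")
```

For this modest spread of system temperatures, the equal-power result falls short of the optimum by well under one percent, consistent with the statement above. The loss grows as the spread in $T_{Si}$ increases.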

With optimum weighting in the signal combination, all antennas make some
contribution to increasing the signal-to-noise ratio. With other weighting, the overall
sensitivity may be improved by omitting antennas with poor performance. Moran
(1989) has investigated this effect for equal-power weighting. To simplify the
situation, it was assumed that $T_A$ is the same for all antennas and only $T_S$ varies.
Consider an array undergoing an upgrade of the receiver input stages, in which
a fraction $n_1/n_a$ of the antennas have been refitted with new input stages that reduce
the system temperature from $T_S$ to $T_S/\xi$. After a certain fraction of the antennas have
been refitted, the array sensitivity is improved by omitting the unimproved antennas
because their input stages are noisier. When $T_A$ does not vary, we can represent
the signal voltage received by each antenna by $V$, and Eq. (9.189) for equal-power
weighting becomes


$$R_{\rm sn} = \frac{V}{\sqrt{n_a}} \sum_i \frac{1}{\sqrt{T_{Si}}} . \tag{9.190}$$
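Thus, we can write


$$\frac{R_{\rm sn}(n_1\ \text{refits only})}{R_{\rm sn}(\text{all}\ n_a\ \text{antennas})} = \frac{n_1 V \sqrt{\xi}}{\sqrt{n_1}\sqrt{T_S}} \left[ \frac{V}{\sqrt{n_a}} \left( \frac{n_1 \sqrt{\xi}}{\sqrt{T_S}} + \frac{n_a - n_1}{\sqrt{T_S}} \right) \right]^{-1} . \tag{9.191}$$


The unimproved antennas should be omitted if the expression above is greater than
unity, which occurs for


$$\frac{n_1}{n_a} > \left( \sqrt{\xi} - 1 \right)^{-2} . \tag{9.192}$$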


Figure 9.26 shows $n_1/n_a$ as a function of $\xi$. Thus, for example, if the refitting reduces
$T_S$ by a factor of six, then when about half the antennas have been refitted, the
others should be omitted. However, unless $\xi > 4$, all antennas should be retained.


Fig. 9.26 The fraction of antennas, $n_1/n_a$, in a phased array with equal-power weighting, for
which the system temperature must be reduced by a factor $\xi$ before the remaining antennas should
be omitted. From Moran (1989), © Kluwer Academic Publishers. With kind permission from
Springer Science and Business Media.

Fig. 9.27 Examples of the phased-array operation of the SMA at 280 GHz on the source 3C354.3,
which had a flux density of 10 Jy. The phasing efficiency is the ratio of the vector sum of the
pairwise visibilities to their scalar sum. Seven antennas of the SMA were used in the extended
configuration with baselines between 44 and 226 m. The weather conditions were: clear sky,
1.3 mm of precipitable water vapor, and wind speed of 2 m/s. The elevation angle range was
44–50° in the left panel and 65–71° in the right panel. The improving atmospheric stability with
increasing elevation angle and time since sunset is evident. Adapted from Young et al. (2016).


In practice, a factor of four would be an unusually big improvement, so it can 
be concluded that omitting antennas is rarely useful. A similar analysis based on 
Eq. (9.188) shows that with optimum weighting, the sensitivity is never improved 
by omitting antennas. 
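
The omission criterion for equal-power weighting is easy to evaluate numerically. In the sketch below, the ratio of Eq. (9.191) is simplified by cancelling $V$ and $T_S$, and the threshold uses the condition that this ratio exceeds unity, which reduces to $n_1/n_a > (\sqrt{\xi} - 1)^{-2}$ for $\xi > 4$. The function names and the 27-antenna example are illustrative assumptions.

```python
import math

def snr_ratio(n1, n_a, xi):
    """SNR using only the n1 refitted antennas, relative to using all n_a.

    Simplified form of Eq. (9.191) after V and T_S cancel.
    """
    return math.sqrt(n1 * n_a * xi) / (n1 * math.sqrt(xi) + (n_a - n1))

def threshold_fraction(xi):
    """Refitted fraction n1/n_a above which omitting the rest helps (xi > 4)."""
    return (math.sqrt(xi) - 1.0) ** -2

# Improvement factor of six in system temperature, hypothetical 27-antenna array.
xi = 6.0
frac = threshold_fraction(xi)
print(f"threshold fraction for xi = {xi}: {frac:.3f}")
for n1 in (10, 14):
    print(n1, round(snr_ratio(n1, 27, xi), 3))
```

With a factor of six, the threshold is just under one half, and at a factor of four it reaches unity, so omission never helps below that value.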

For VLBI, the output of a phased array is usually requantized to fit the recording 
format. The first quantization of the signals, before they are combined, introduces 
quantization noise that, after combination, has a probability distribution that tends 
to Gaussian as the number of antennas becomes large. Thus, for such arrays, the 
additional loss in sensitivity in requantizing is close to the values of $\eta_Q$ derived in
Chap. 8, for which Gaussian noise is assumed. For other cases, see Kokkeler et al. 
(2001). 

The phasing of the SMA is described by Young et al. (2016) and of ALMA 
by Baudry et al. (2012). An example of a phased array in operation is shown in 
Fig. 9.27. 


9.10 Orbiting VLBI (OVLBI) 


The basic requirements for a VLBI station, whether orbiting or terrestrial, include 
a timing system so that the time associated with each digital sample of the received 
signal is recoverable, and a position for the antenna known with sufficient accuracy 
that the fringe frequency (but not necessarily the fringe phase) can be determined. 
The timing system must be stable to a fraction of the period of the received 
signal frequency over a coherence time of tens or hundreds of seconds. If it is not 
possible to put a precise frequency standard on a satellite, then a timing link of
equivalent stability must be implemented. Establishing this timing system, which 
provides the local oscillators and the sampling clock at the satellite, is a major 
technical challenge in OVLBI. The radial motion of the satellite introduces Doppler 
shifts, and the tangential motion causes the link path to move relative to the 
atmospheric irregularities. One or more reference frequencies are transmitted to the 
satellite over a radio link. The position of the satellite at any time is known from 
standard orbit-tracking procedures to an accuracy of some tens of meters. This is 
sufficient to determine the (u, v) coordinates of the baseline but not sufficient for the 
timing accuracy required. To solve the timing problem, a round-trip phase system 
implemented by radio link is required. This is identical in principle to the round-trip 
systems for cables discussed in Sect. 7.2. A discussion of the basic requirements of 
the timing system is given by D’Addario (1991).

Figure 9.28 shows a simplified example of a system at the satellite and Earth 
station, which illustrates the essential functions. In this case, a frequency standard 
is not included in the satellite. A frequency standard in the Earth station provides 
a reference frequency to synthesizer Sy, from which a signal is transmitted to the 
satellite. This signal provides a reference for synthesizers S,, Sz, and S, that produce 
signals for the round-trip phase measurement, the local oscillator (LO) of the radio 
astronomy receiver, and the sampling clock, respectively. The signal from Sj is 
radiated to the Earth station, where its phase is compared in a correlator with a 

Fig. 9.28 Simplified block diagram of the basic signal transmission and processing required on
an OVLBI spacecraft and at the Earth station. See text for further explanation. © 1991 IEEE.
Reprinted, with permission, from L. R. D’Addario (1991).

locally generated signal at the same frequency. The correlator output is a measure
of $\Delta\tau$, the change in the time delay of the round-trip path. The signal from the
radio telescope on the spacecraft goes to a low-noise amplifier, a filter, and a mixer 
in which it is converted to intermediate frequency (IF) by the LO signal from Sz. 
The IF signal then goes to an IF amplifier, a sampler (represented by a switch), 
and a quantizer, Q(x). The counter n is driven by the sampler clock signal from 
synthesizer S, and provides timing signals. These provide a record of when each 
data point was taken, information for formatting the data, and other timing functions 
required on the satellite. The counter nę provides timing at the ground location. 
Some complications with the operation of the scheme just outlined are: 


1. The round-trip phase measures the length of the round-trip path with an ambi- 
guity of an integral number of wavelengths. It provides a measure of changes in 
path length that are continuous. 

2. Unless the frequencies generated by the three synthesizers at the satellite are 
harmonics of one or more reference frequencies supplied (so that no frequency 
division is necessary in the synthesizers), then the phases of the frequencies will 
be ambiguous. 

3. The transmission times for the reference frequencies and the data may differ 
because of dispersion in the path or differences in the electronics. 


These limitations cause problems when there are discontinuities in the link 
contact between the satellite and the Earth station. If there is continuous contact 
during an observing period, then once fringes are found, the combined effect 
of the ambiguities is determined. The continuous monitoring of the variation of 
the path enables the solution to be extended throughout the observing period. 
However, if signal contact is lost due to interference, atmospheric effects, or 
equipment problems, phase-locked loops in the synthesizers lose lock, and a phase 
discontinuity will result when the signals are regained. If the round-trip tracking is 
interrupted for a long period, another fringe search of the data may be required. 

For any round-trip measurement, use of the same frequency in both directions 
would simplify the determination of the one-way propagation time, since the effects 
of dispersion would be largely eliminated. This would be technically feasible with 
time sharing or a very small frequency offset to allow signals in the two directions to 
be separated. However, the international radio regulations usually allocate different 
frequency bands for the two directions of transmission. Measurement of the round- 
trip path at two frequencies is therefore important in determining the relative 
contributions of the neutral and ionized media to the propagation time. If a high- 
stability frequency standard is included on a satellite, it could serve as the primary 
clock or as a backup to a radio-link timing system to help keep time at the satellite 
during link dropouts. Relativistic effects are a complication in the use of an onboard 
clock, causing its time to vary with respect to Earth-station clocks as the satellite 
moves through regions of differing strength of the Earth’s gravitational field (Ashby 
and Allan 1979; Vessot 1991). 

The first OVLBI experiment was carried out with a satellite in the NASA 
Tracking and Data Relay Satellite System (TDRSS), which was adapted for VLBI 
use (Levy et al. 1986, 1989). The purpose of this geostationary satellite was to
relay signals from low Earth orbit to ground stations. It was equipped with two 
4.9-m-diameter antennas, both with receivers at 2.3 and 15 GHz, and an up–down
link communication system at 15.0 and 13.7 GHz. One of the 4.9-m-diameter 
antennas was used to receive the astronomical signals. The experiment provided 
limited astronomical data [see Linfield et al. (1989, 1990)] but proved to be an 
invaluable test bed for time and phase transfer techniques as well as data recovery 
and processing methods. It was necessary to time-tag the data at the ground station, 
and hence, the satellite range was part of the interferometer delay. The onboard 
oscillators were phase-locked via the timing link, much as described in the previous 
paragraph. However, the coherence of the interferometer was greatly improved by 
using the second 4.9-m-diameter antenna as part of a separate two-way link at 
2.278 GHz. The coherence of the interferometer at 2.3 GHz was found to be 0.98, 
0.95, and 0.94 for integration times of 100, 200, and 700 s, respectively. This shows 
the effective Allan variance of the whole interferometer system of better than $10^{-13}$
(see discussion in Sect. 9.5.2). 

The first satellite specifically designed for use as an orbiting element in a VLBI 
array was the HALCA satellite (VSOP project), launched in 1997, followed by 
the Spektr-R satellite (RadioAstron project), launched in 2011. Some of the key 
specifications of these satellites are listed in Table 9.9. Typical (u, v) plane tracks are 
shown in Fig. 5.22. RadioAstron has an onboard hydrogen maser frequency standard 
so that the timing transfer link is not required to synchronize the local oscillator. 
However, the search for fringes can be a significant task. With orbit position and 
velocity uncertainty of ±500 m and 20 mm/s, the delay uncertainty is about 30 ns,
or the equivalent of about ±2000 delay steps, and the fringe rate uncertainty is
±3 Hz at 6-cm wavelength. The processing must also include a fringe acceleration
term. A description of a lag-type correlator designed specifically to include OVLBI 
stations is given by Carlson et al. (1999). 


9.11 Satellite Positioning 


The three-dimensional locations of geostationary satellites can be determined with a 
VLBI array because they lie within its near field (see Section 15.1.3). To understand 
the sensitivity of a VLBI array to the range, or altitude, of a satellite, consider the 
simplified geometry shown in Fig. 9.29. In this exercise, the satellite is directly
overhead at station 1 of a three-station linear array whose baseline is normal to
the direction to the satellite. As a result of being in the near field, the curvature 
of the spherical wavefront of a broadband-transmitted signal from the satellite can 
be measured. Note that at least three stations are required, because with only two 
stations, wavefront tilt cannot be distinguished from wavefront curvature. For the 
purpose of this exercise, we assume that the bandwidth of the transmitted signal is 
broad enough that delays can be measured accurately since the signal-to-noise ratio 
can be expected to be very high. The accuracy of the measurement of $R$ will be
limited by the effects of the atmosphere and ionosphere. Phase measurement could
also be used but might be subject to phase ambiguities.

Fig. 9.29 Simplified geometry for tracking a geostationary satellite with a three-station VLBI
array. Because the satellite will be in the near field of the array, i.e., $R \ll D^2/\lambda$ for typical
values of $D$ and $\lambda$, the wavefront curvature can be measured.

Table 9.9 Parameters of orbiting VLBI stations

                          HALCA (VSOP)ᵃ                  Spektr-R (RadioAstron)ᵇ
Lead institution          Institute of Space and         Astro Space Center (IKI) of the
                          Astronautical Science (Japan)  Lebedev Physics Institute (Russia)
Launch date               February 12, 1997ᶜ             July 18, 2011
Orbital parametersᵈ
  Semimajor axis          17,350 km                      174,714 km
  Eccentricity            0.60                           0.69
  Inclination             31°                            80°
  Period                  6.3 h                          8.3 days
  Apogee height           21,400 km                      289,246 km
  Perigee height          560 km                         47,442 km
Maximum resolution        580 μas (λ = 6 cm)             8 μas (λ = 1.3 cm)
Orbital determinationᵉ    ±15 m, 6 mm/s                  ±500 m, 20 mm/s
Antenna diameter          8 m                            10 m
Slew rate                 2.25°/min                      2°/min
Pointing accuracy         1′                             1.5′
Polarization              LCP                            RCP/LCP
Operating bands           6, 18 cmᶠ                      1.3, 6, 18, 92 cm
Tsys                      95, 75 K                       127, 147, 41, 145 K
Aperture efficiency       0.35, 0.24                     0.10/0.45/0.52/0.38
Channels/bandwidth        2 × 16 MHz                     4 × 16 MHz
Sampling                  2 bit                          1 bit
Total data rate           128 Mbits/s                    128 Mbits/s
Onboard frequency std.    Crystal oscillatorᵍ            Hydrogen maser
Timing transfer           15.3/14.2 GHz                  8.4/7.2 GHzʰ
Ground stations           Usuda (Japan)                  Pushchino (Russia)
                          Goldstone (USA)                Green Bank (USA)ⁱ
                          Green Bank (USA)
                          Robledo (Spain)
                          Tidbinbilla (Australia)
Satellite control         Kagoshima (Japan)              Bear Lake (Russia)

ᵃ Information from Hirabayashi et al. (1998, 2000) and Kobayashi et al. (2000).
ᵇ Information from RadioAstron Science and Technical Operations Group (2015) and Kardashev
et al. (2013).
ᶜ Operational until 2003.
ᵈ HALCA: The argument of perigee and longitude of ascending node have periods of 1 and 1.6
years, respectively. RadioAstron: Orbit on April 14, 2012, after repositioning to an orbit of lower
eccentricity. Orbit subject to perturbations by the Sun and Moon; eccentricity varies between 0.58
and 0.96.
ᵉ Accuracy of reconstructed orbit available ~2 weeks after observation, determined from Doppler
tracking and orbital analysis.
ᶠ 1.35-cm channel not used because of poor sensitivity.
ᵍ Phase-locked to uplink signal.
ʰ Available as backup to onboard hydrogen maser.
ⁱ See Ford et al. (2014).

The excess geometric path length, $x$, to station 2 or 3 with respect to station 1 is
determined by the relation $(R + x)^2 = R^2 + D^2$. To first order, $x \approx D^2/2R$, and the
delay is $\tau = x/c = D^2/2Rc$.

Taking the differential of this expression gives the change in delay, $\Delta\tau$, that
results from a change in range, $\Delta R$:


$$\Delta\tau = -\frac{1}{2c} \left( \frac{D}{R} \right)^2 \Delta R . \tag{9.193}$$


This expression can be used to determine the range accuracy, given the accuracy of 
the delay measurement. 
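
The two steps leading to Eq. (9.193) can be written out explicitly. This is only an expansion of the relations already given, valid for $x \ll R$:

```latex
(R + x)^2 = R^2 + D^2
\;\Rightarrow\; 2Rx + x^2 = D^2
\;\Rightarrow\; x \approx \frac{D^2}{2R} \quad (x \ll R),
\qquad
\tau = \frac{D^2}{2Rc}
\;\Rightarrow\; \frac{d\tau}{dR} = -\frac{D^2}{2R^2 c} = -\frac{1}{2c}\left(\frac{D}{R}\right)^2 .
```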

Now consider the limitations imposed by the atmosphere. Normal astrometric
positioning can be done to an accuracy of $\sigma_\theta$, which implies an uncertainty in delay
of $\sigma_\tau \approx D\sigma_\theta/c$. Hence, the uncertainty in range $\sigma_R$ is given by


$$\sigma_R = \frac{2R^2}{D}\, \sigma_\theta , \tag{9.194}$$


while the uncertainty in the transverse direction, $\sigma_{R\perp}$, is

$$\sigma_{R\perp} = R\, \sigma_\theta . \tag{9.195}$$


Hence, the relative accuracy of the longitudinal and transverse position is


$$\frac{\sigma_R}{\sigma_{R\perp}} = 2 \left( \frac{R}{D} \right) . \tag{9.196}$$

Consider the following example, with parameters $R = 36{,}000$ km, $D = 3600$ km,
$\lambda = 3$ cm (a typical wavelength for geostationary satellites), and $\sigma_\theta = 100$ μas.
From the above equations, we obtain $\sigma_\tau \approx 0.006$ ns (which corresponds to an rms
phase uncertainty of about 20°), $\sigma_R \approx 40$ cm, $\sigma_{R\perp} \approx 2$ cm, and $\sigma_R/\sigma_{R\perp} = 20$. The
position can be determined from a single short observation without reliance on Earth 
rotation. The velocity of the satellite can be determined from the rate of change of 
position parameters. For an operational system, at least four systems are required 
since three are needed to define a reference plane. The earliest attempt to measure a 
satellite position with VLBI was done by Preston et al. (1972). 
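
The example can be reproduced with a short calculation. The sketch below evaluates $\sigma_\tau \approx D\sigma_\theta/c$ together with Eqs. (9.194) to (9.196) and checks the near-field condition; the function name `position_uncertainties` and the variable names are illustrative assumptions.

```python
import math

C = 299_792_458.0                                  # speed of light, m/s
MICROARCSEC = math.radians(1.0 / 3600.0) * 1e-6    # radians per microarcsecond

def position_uncertainties(R, D, sigma_theta):
    """Return (sigma_tau [s], sigma_R [m], sigma_perp [m]) for range R and baseline D."""
    sigma_tau = D * sigma_theta / C            # delay uncertainty
    sigma_R = 2.0 * R**2 * sigma_theta / D     # Eq. (9.194), along the line of sight
    sigma_perp = R * sigma_theta               # Eq. (9.195), transverse
    return sigma_tau, sigma_R, sigma_perp

R = 36_000e3        # range to the geostationary satellite, m
D = 3_600e3         # baseline, m
lam = 0.03          # wavelength, m
sigma_theta = 100 * MICROARCSEC

sigma_tau, sigma_R, sigma_perp = position_uncertainties(R, D, sigma_theta)
phase_deg = 360.0 * sigma_tau * C / lam    # delay uncertainty expressed as phase
near_field = R < D**2 / lam                # near-field condition of Fig. 9.29

print(f"sigma_tau = {sigma_tau * 1e12:.1f} ps, phase = {phase_deg:.0f} deg")
print(f"sigma_R = {sigma_R * 100:.0f} cm, sigma_perp = {sigma_perp * 100:.1f} cm")
print(f"ratio = {sigma_R / sigma_perp:.0f}, near field: {near_field}")
```

The ratio of the two uncertainties depends only on $2R/D$, so a longer baseline improves the range accuracy relative to the transverse accuracy.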


Chapter 10
Calibration and Imaging


This chapter is concerned with the calibration and Fourier transformation of 
visibility data, mainly as applied to Earth-rotation synthesis. Methods for the 
evaluation of the visibility measurements on a rectangular grid of points, necessary 
for the use of the discrete Fourier transform as implemented with the fast Fourier 
transform (FFT) algorithm, are discussed. Phase and amplitude closure conditions, 
which are valuable calibration tools, are also described. Analysis of the causes of 
certain types of image defects is given. Special consideration is given for certain 
observing modes, such as spectral line, and conversion of frequency to velocity is 
described. In addition, methods of extracting astronomical information directly from 
visibility data by model fitting are described. These techniques are important even 
with arrays having excellent (u, v) coverage. Some methods of calculating Fourier 
transforms before the advent of the FFT are discussed in Appendix 10.3. 


10.1 Calibration of the Visibility 


The purpose of calibration is to remove, insofar as possible, the effects of instrumen- 
tal and atmospheric factors in the measurements. Such factors depend largely on the 
individual antennas or antenna pairs and their associated electronics, so correction 
must be applied to the visibility data before they are combined into an image. 
Editing the visibility data to delete any that show evidence of radio interference 
or equipment malfunction is usually performed before the full calibration process. 
This largely entails examining samples of data for unexpected amplitude or phase 
variations. Data taken on unresolved calibration sources are particularly useful here, 
since the response to such a source is predictable and should vary only slowly and 
smoothly with time. 

In the calibration procedure, we first consider instrumental factors that are stable
with time over periods of weeks or more. These include: 


1. antenna position coordinates that specify the baselines, 

2. antenna pointing corrections resulting from axis misalignments or other mechan- 
ical tolerances, 

3. zero-point settings of the instrumental delays, that is, the settings for which the 
delays from the antennas to the correlator inputs are equal. 


These parameters vary only as a result of major changes such as the relocation of 
an antenna. They can be calibrated by observing unresolved sources with known 
positions (see Sect. 12.2). We assume here that they have been determined in 
advance of the imaging observations. We also assume that correction for the 
nonlinearity of signal quantization, which is discussed in Sect. 8.4, has been applied 
if required. 


10.1.1 Corrections for Calculable or Directly Monitored Effects


Calibration of the visibility measurements for effects that vary during an observation 
principally involves correction of the complex gains of the antenna pairs. Such 
factors can be divided into those for which the behavior can be predicted or directly 
measured and those for which it must be determined by observing a calibration 
source during the observation period. Examples of effects that can be corrected by
calculation include:


1. the constant component of atmospheric attenuation as a function of zenith angle 
(see Sect. 13.1.3), 

2. variation of antenna gain as a function of elevation caused by elastic deformation 
of the structure under gravity. This may be based on pointing observations as 
well as structural calculations. 


Shadowing, in which one antenna partially blocks the aperture of another, can occur 
at close spacings and low elevation angles. In principle, it is a problem that should be 
calibratable, since the positions and structures of the antennas are known. However, 
the effect of the geometrical blockage is complicated by diffraction, the shape of 
the primary beam is modified, and the position of the phase center of the aperture is 
shifted, thus affecting the baseline. Overall, these effects are often too complicated 
to be analyzed, and data from shadowed antennas are often discarded. 

Effects within the receiving system, or external to it, that can be continuously 
monitored during an observation include: 


1. variation of system noise temperature, which can result from changes in the 
ground radiation picked up in the sidelobes as the antenna tracks or from changes 
in atmospheric opacity. This effect may also cause variation in the gain as a result 
of automatic level control (ALC) action that is used in some instruments to adjust
the signal levels at the sampler or correlator (see Sect. 7.6). Monitoring can be 
performed by injection of a low-level, switched, noise signal at the receiver input 
and detection of it later in the system. 

2. phase variations in the local oscillator system monitored by round-trip phase 
measurement (see Sect. 7.2), 

3. the variable component of atmospheric delay monitored by using water vapor 
radiometers mounted at the antennas (see Sect. 13.3). 


Corrections for these effects are usually performed at an early stage of the 
calibration procedure. 


10.1.2 Use of Calibration Sources 


Further steps in the calibration involve parameters that may vary on timescales of 
minutes or hours and require the observation of one or more calibration sources. 
Note that the source that is the subject of the astronomical investigation will 
be referred to as the target source to distinguish it from the calibration source, 
or calibrator. From Eq. (3.9), we can write the small-field expression for the 
interferometer response as follows: 


$$[V(u,v)]_{\rm uncal} = G_{mn}(t) \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{A_N(l,m)\, I(l,m)}{\sqrt{1 - l^2 - m^2}}\, e^{-j2\pi(ul + vm)}\, dl\, dm , \tag{10.1}$$


where $[V(u,v)]_{\rm uncal}$ is the uncalibrated visibility, and $I(l,m)$ is the source intensity.
The complex gain factor $G_{mn}(t)$ is a function of the antenna pair $(m,n)$ and, as a
result of unwanted effects, may vary with time. $A_N$ is the antenna aperture
normalized to unity for the direction of the main beam. It can be removed from the source
image as a final step in the image processing. The factor $A_N(l,m)/\sqrt{1 - l^2 - m^2}$ is
close to unity, and from here on, we generally omit it, except in the case of
wide-field imaging. To calibrate $G_{mn}(t)$, an unresolved calibrator can be observed, for
which the measured response is


$$V_c(u,v) = G_{mn}(t)\, S_c , \tag{10.2}$$


where the subscript c indicates a calibrator, and S, is the flux density of the calibra- 
tor. In calibrating the gain, it is best to consider the amplitude and phase separately, 
since the errors in these two quantities generally arise through different mechanisms. 
For example, atmospheric fluctuations due to tropospheric inhomogeneity cause 
phase fluctuations but have little effect on the amplitudes. To calibrate the visibility 
of the target source, we can write 


$$V(u,v) = \frac{[V(u,v)]_{\rm uncal}}{G_{mn}(t)} = [V(u,v)]_{\rm uncal}\,\frac{S_c}{V_c(u,v)} . \qquad (10.3)$$

To observe the calibration source, it is usually placed at the phase center of its field.
Then assuming that the calibrator is unresolved, the phase is a direct measure of the 
instrumental phase. Thus, phase calibration for the target source requires subtracting 
the calibrator phase from the observed phase. The visibility amplitude can be 
calibrated by using the moduli of the visibility terms in Eq. (10.3). The response 
to the calibrator should be corrected for the calculable and/or directly monitored 
effects before the gain calibration is performed. Where there are separate receiving 
channels for two opposite polarizations at each antenna, the calibration should be 
performed separately for each one. For measurements of source polarization, further 
calibration procedures are necessary, as described in Sect. 4.7.5. 
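
As a minimal sketch of Eqs. (10.2) and (10.3), the following Python fragment derives the gain from the response to an unresolved calibrator and applies it to a target visibility. The function name `calibrate_visibility` and all numerical values are illustrative assumptions, not part of any particular software package.

```python
import cmath

def calibrate_visibility(v_uncal, v_cal_obs, s_cal):
    """Apply Eq. (10.3): V = [V]_uncal * S_c / V_c.

    v_uncal   -- uncalibrated target visibility (complex)
    v_cal_obs -- measured response to the unresolved calibrator, V_c (complex)
    s_cal     -- calibrator flux density S_c in Jy (real)
    """
    gain = v_cal_obs / s_cal      # G_mn(t), from Eq. (10.2)
    return v_uncal / gain         # Eq. (10.3)

# Hypothetical values: a baseline gain of amplitude 2.0 and phase 30 degrees.
gain_true = cmath.rect(2.0, cmath.pi / 6)
s_cal = 5.0                       # assumed calibrator flux density, Jy
v_true = cmath.rect(1.5, 0.4)     # assumed true target visibility
v_cal_obs = gain_true * s_cal     # what the calibrator scan would measure
v_uncal = gain_true * v_true      # what the target scan would measure

v_calibrated = calibrate_visibility(v_uncal, v_cal_obs, s_cal)
# Phase calibration amounts to subtracting the calibrator phase.
phase_diff = cmath.phase(v_uncal) - cmath.phase(v_cal_obs)
```

In this noise-free example, `v_calibrated` recovers `v_true`, and `phase_diff` equals the phase of the calibrated visibility.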

Calibration observations require periodic interruption of observations of the tar- 
get source. At centimeter wavelengths, the interval between calibration observations 
depends on the stability of the instrument and typically falls within the range of 
15 min to 1 h. At meter and centimeter wavelengths, the ionosphere and the neutral 
atmosphere introduce gain and phase changes, and elimination of these may require 
observation of a calibrator at time intervals as short as a few minutes. At millimeter 
and submillimeter wavelengths, calibration at time intervals less than a minute is 
usually required. 

As indicated by Eq. (7.38), $G_{mn} = g_m g_n^*$, so the measured gains for antenna
pairs can be used to determine gain factors for the individual antennas. Using the 
individual antenna gain factors rather than the baseline gain factors reduces the 
calibration data to be stored and helps in monitoring the performance of individual 
antennas. Also, with this technique, some of the spacings can be omitted from the 
calibration observation so long as each of the antennas is included. In practice, 
gain tables including both amplitude and phase are generated for the antennas as 
a function of time, and the values are interpolated to the times at which data from 
the target source were taken. The interpolation should be done separately for the 
amplitude and phase, not for the real and imaginary parts of the gain; otherwise, the 
phase errors can degrade the amplitude, and vice versa. The desirable characteristics 
of a calibration source are the following. 


Flux density. The calibrator should be strong, so that a good signal-to-noise ratio 
is obtained in a short time, to reduce the (u,v) coverage lost from the target 
source. The gaps in the (u, v) coverage are more serious for a linear array, in 
which complete sectors are lost, than for a two-dimensional array, in which the 
instantaneous coverage is more widely distributed in u and v. 

Angular width. The calibrator should, if possible, be unresolved so that precise 
details of its visibility are not required. 

Position. The position of the calibrator should be close to that of the target 
source. Effects in the atmosphere or antennas that cause the gain to vary with 
pointing angle are then more effectively removed, and time lost in driving the 
antennas between the target source and calibrator positions is kept small. At 
millimeter wavelengths, where the atmospheric phase path is the main factor 
being calibrated, the calibrator distance must be within the angular scale of the
irregularities. This usually means a distance of no more than a few degrees on
the sky (see Sect. 13.4). 
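
The recommendation, made before the list of calibrator characteristics, to interpolate gain amplitude and phase separately can be illustrated with a short sketch. The function names are hypothetical, and the phase difference is simply wrapped to the shortest arc, which is an assumption that holds only when the phase changes by less than half a turn between calibrator scans.

```python
import cmath

def interp_gain_amp_phase(g0, g1, frac):
    """Interpolate between complex gains g0 and g1 (0 <= frac <= 1),
    treating amplitude and phase separately. g0 must be nonzero."""
    amp = (1.0 - frac) * abs(g0) + frac * abs(g1)
    dphi = cmath.phase(g1 / g0)            # phase step, wrapped to (-pi, pi]
    return cmath.rect(amp, cmath.phase(g0) + frac * dphi)

def interp_gain_re_im(g0, g1, frac):
    """Interpolate the real and imaginary parts (shown for comparison only)."""
    return (1.0 - frac) * g0 + frac * g1

# Two gains of unit amplitude whose phases differ by 90 degrees.
g0 = cmath.rect(1.0, 0.0)
g1 = cmath.rect(1.0, cmath.pi / 2)

g_ap = interp_gain_amp_phase(g0, g1, 0.5)  # amplitude stays at 1
g_ri = interp_gain_re_im(g0, g1, 0.5)      # amplitude drops to 1/sqrt(2)
```

Although the amplitude is constant in this example, interpolating the real and imaginary parts lowers it by about 29% at the midpoint. The phase drift has leaked into the amplitude.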


It is not always possible to find a calibrator that satisfies all of the above 
requirements. In such cases, it may be necessary to find a source that is largely 
unresolved and close to the target source and then calibrate it against one of the 
more commonly used flux density references such as 3C48, 3C147, 3C286, and 
3C295. The last of these is the most reliable with regard to nonvariability. Thermal 
sources such as the compact planetary nebula NGC7027 may be useful as amplitude 
calibrators for short baselines. At millimeter wavelengths, it may be more difficult 
to find a source that provides a strong signal for test purposes or calibration. Disks 
of planets become resolved at rather short baselines, but the limb of the Moon or a 
planet can be useful: see Appendix 10.1. 

The use of clusters of small sources as calibrators has been investigated by 
Kazemi et al. (2013). Such clusters might typically consist of two to ten sources of 
small angular diameter, and flux densities are correspondingly lower than required 
for single calibration sources. This approach allows calibrators to be found closer to 
the object under investigation and thus potentially increases the number available as 
well as reducing errors related to angular distance. 

For VLBI observations with milliarcsecond resolution, there are fewer suitable 
calibrators. Angular structure on this scale is sometimes variable over periods of 
months, and caution is necessary if a previously measured and partially resolved 
source is to be used as a calibrator. An alternative approach to amplitude calibration 
of VLBI data involves use of the system temperatures and collecting areas of 
the individual antennas, as follows. The cross-correlation data should first be 
normalized to unity for the case in which the two input data streams are fully 
correlated. To obtain this normalization, the data are divided by the product of 
the rms values of the data streams at the two correlator inputs. (For two-level 
sampling, this rms value is unity, and for other types of sampling, the rms depends 
on the setting of the sampler thresholds with respect to the level of the analog 
signal.) Then, to convert the normalized correlation to visibility $V$ with units of
flux density (janskys), the amplitude is multiplied by the geometric mean of the
system equivalent flux density (SEFD) values for the two antennas involved. The
system equivalent flux density, SEFD $= 2kT_S/A$, is defined in Eq. (1.7). If the
value of $T_S$ corresponds to a signal plane above the atmosphere, then the resulting
visibility values will be corrected for atmospheric losses. For VLBI data in which 
the phase may sometimes not be calibrated, the closure relationships in Sect. 10.3 
allow images to be formed if absolute position is not required. 
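
The SEFD-based amplitude calibration described above can be sketched as follows. The antenna parameters are invented for illustration, and $A$ is taken to be the effective collecting area.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
JANSKY = 1.0e-26             # W m^-2 Hz^-1

def sefd_jy(t_sys_k, area_eff_m2):
    """System equivalent flux density, SEFD = 2 k T_S / A, in janskys."""
    return 2.0 * K_BOLTZMANN * t_sys_k / area_eff_m2 / JANSKY

def correlation_to_visibility(rho, sefd_m, sefd_n):
    """Scale a normalized correlation coefficient rho to flux density (Jy)
    using the geometric mean of the two SEFD values."""
    return rho * math.sqrt(sefd_m * sefd_n)

# Hypothetical antennas: (T_S in K, effective area in m^2).
sefd_a = sefd_jy(50.0, 300.0)
sefd_b = sefd_jy(80.0, 1500.0)
v_jy = correlation_to_visibility(1.0e-3, sefd_a, sefd_b)
```

With these assumed values, the two SEFDs are roughly 460 Jy and 147 Jy, so a normalized correlation of $10^{-3}$ corresponds to a visibility amplitude of about 0.26 Jy.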
10.2.1 Imaging by Direct Fourier Transformation


A straightforward method of obtaining an estimate of the intensity distribution from 
measured visibility data is by direct Fourier transformation, that is, by performing 
the transformation without putting the visibility into any special form such as 
interpolating it onto a uniform grid. The measured visibility $V_{\rm meas}(u,v)$ can be
written


$$V_{\rm meas}(u,v) = W(u,v)\,w(u,v)\,V(u,v) , \qquad (10.4)$$


where W(u, v) is the transfer function or spatial sensitivity function introduced in 
Sect. 5.3, and w(u, v) represents any applied weighting. The Fourier transform of 
Eq. (10.4) is the measured intensity distribution (i.e., the image), which is 


$$I_{\rm meas}(l,m) = I(l,m) \ast\ast\, b_0(l,m) , \qquad (10.5)$$


where the double asterisk indicates two-dimensional convolution, and $b_0$ is the
synthesized beam, which is the Fourier transform of the weighted transfer function:


$$b_0(l,m) \longleftrightarrow W(u,v)\,w(u,v) , \qquad (10.6)$$


where $\longleftrightarrow$ indicates the Fourier transform relationship. Effects such as those of
noncoplanar baselines, signal bandwidth, and visibility averaging are not included
here. $b_0(l,m)$ is also known as the point-source response function or the dirty
beam, in the context of the CLEAN deconvolution algorithm, which is discussed 
in Sect. 11.1. 

The visibility is measured at an ensemble of $n_d$ points in the (u, v) plane. If the
antennas are identically polarized and the source is unpolarized, the direct Fourier
transform of these data is represented by


$$I_{\rm meas}(l,m) = \sum_{i=1}^{n_d} w_i \left[ V_{\rm meas}(u_i,v_i)\,e^{j2\pi(u_i l+v_i m)} + V_{\rm meas}(-u_i,-v_i)\,e^{-j2\pi(u_i l+v_i m)} \right] . \qquad (10.7)$$


The fundamental issue in image synthesis is whether we can recover $I(l,m)$
from $I_{\rm meas}(l,m)$. In principle, Eq. (10.4) can be used to determine $V(u,v)$ as
$V_{\rm meas}(u,v)/[W(u,v)\,w(u,v)]$. The image can be calculated exactly if $W(u,v)\,w(u,v)$
is everywhere nonzero.
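
A direct evaluation of Eq. (10.7) can be sketched in a few lines. The sketch assumes that the value at $(-u_i, -v_i)$ is the complex conjugate of the measured visibility, as holds for a real intensity distribution. The function name, the (u, v) sample positions, and the test source are illustrative assumptions.

```python
import cmath
import math

def direct_image(l, m, uv, vis, weights):
    """Evaluate Eq. (10.7) at one image point (l, m).

    uv      -- list of (u, v) sample positions in wavelengths
    vis     -- measured complex visibilities at those positions
    weights -- weights w_i
    The term at (-u, -v) is taken as the complex conjugate of the
    measured value, so each pair of terms sums to a real number.
    """
    total = 0.0
    for (u, v), vm, w in zip(uv, vis, weights):
        phasor = cmath.exp(2j * math.pi * (u * l + v * m))
        pair = vm * phasor + vm.conjugate() * phasor.conjugate()
        total += w * pair.real
    return total

# Hypothetical sampling and a point source of flux density 2 at (l0, m0).
uv = [(100.0 * i, 50.0 * k) for i in range(1, 6) for k in range(1, 5)]
flux, l0, m0 = 2.0, 1.0e-3, -5.0e-4
vis = [flux * cmath.exp(-2j * math.pi * (u * l0 + v * m0)) for u, v in uv]
weights = [1.0 / (2 * len(uv))] * len(uv)   # normalizes the peak to the flux

peak = direct_image(l0, m0, uv, vis, weights)
off_source = direct_image(0.0, 0.0, uv, vis, weights)
```

The cost is proportional to the number of visibilities times the number of image points, which is why this approach is practical only for small data sets.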

Bracewell and Roberts (1954) pointed out that, in principle, there are an infinite 
number of solutions to the convolution in Eq. (10.5), since one can add any arbitrary 
visibility values in the unsampled areas of the (u, v) plane. The Fourier transform 
of these added values constitutes an invisible distribution that cannot be detected
by any instrument with corresponding zero areas in the transfer function. It may
be argued that in interpreting observations from any radio telescope, one should 
maintain only zeros in the unmeasured regions of spectral sensitivity, to avoid 
arbitrarily generating information. On the other hand, the zeros are themselves 
arbitrary values, some of which are certainly wrong. What is wanted is a procedure 
that allows the visibility at the unmeasured points to take values consistent with 
the most reasonable or likely intensity distribution, while minimizing the addition 
of arbitrary detail. Positivity of intensity and limitation of size of the angular 
structure of a source are expected characteristics that can be introduced into the 
imaging process. Image restoration techniques that implicitly generate nonzero 
visibility values at unmeasured (u, v) points include CLEAN, maximum entropy, 
and compressed sensing, which are discussed in Chap. 11. 


10.2.2 Weighting of the Visibility Data 


To obtain the best signal-to-noise ratio in the summation of measurements that 
contain Gaussian noise, the individual data values should be weighted inversely as 
their variances. The same is true for the combination of sinusoidal components in 
an image of a source, the amplitudes of which are proportional to the corresponding 
visibility points. Thus, for the best signal-to-noise ratio, the weights $w_i$ in Eq. (10.7)
should be inversely proportional to the variances. If the data are obtained with a 
uniform array of antennas and receivers, and the averaging time is the same for 
all data points, then the variances should all be the same, and maximum signal- 
to-noise ratio is obtained by including all measurements with the same weight. 
This is known as natural weighting. For many arrays, natural weighting results in a 
poor beam shape with wide skirts because the shorter spacings are overemphasized. 
Thus, the usual approach is to include in the weighting a factor that is inversely 
related to the area density of the data in the (u, v) plane. The area density $\rho_\sigma(u,v)$
can be defined such that the number of points in the range $u \pm \tfrac{1}{2}du$, $v \pm \tfrac{1}{2}dv$ is
$\rho_\sigma(u,v)\,du\,dv$ (Thompson and Bracewell 1974). Although $\rho_\sigma$ at any given point
depends on the size of the increments $du$ and $dv$, it is usually possible to specify the
variation of relative density and correct for it satisfactorily. As a simple example, in 
the observation of a high-declination source with an east-west array in which the 
antenna spacings are nonredundant integral multiples of a unit value, the visibility 
points lie on concentric circles, as in Fig. 10.1. Then, if the visibility is measured 
at uniform increments in hour angle, the area density at any ring is inversely 
proportional to the radius of the ring. With $w(u,v)$ proportional to $1/\rho_\sigma(u,v)$, the
effective density of the data is uniform within a circle of radius $u_{\max}$ determined by
the maximum spacing. The beam then closely approximates the Fourier transform
of a circular disk function, which, normalized to unity at the maximum, is given by


$$\frac{J_1(2\pi l u_{\max})}{\pi l u_{\max}} , \qquad (10.8)$$


Fig. 10.1 Transfer function (spacing loci) in the (u, v) plane for observations of a
high-declination source using an east-west array with uniform increments in
antenna spacing. The points indicate visibility measurements, and their (u, v) positions
reflected through the origin, for uniform intervals of time. The angle φ indicates data for a
specific hour angle. If the visibility values are weighted in proportion to the radii of
the loci, the density of the visibility data is effectively uniform out to a radius $u_{\max}$.


where $J_1$ is the Bessel function of the first kind and first order. $2J_1(x)/x$ is called
a jinc function, by analogy to a sinc function. The full width of the beam at half-maximum
(FWHM) is $0.705\,u_{\max}^{-1}$, and the first sidelobe response is 13.2% of the
main beam.¹ Similarly, if the effective density of measurements is uniform within
a rectangular area of dimensions $2u_{\max} \times 2v_{\max}$, the synthesized beam is closely
approximated by


$$\frac{\sin(2\pi u_{\max} l)}{2\pi u_{\max} l}\;\frac{\sin(2\pi v_{\max} m)}{2\pi v_{\max} m} . \qquad (10.9)$$


This beam is not circularly symmetrical, and the first sidelobe has a maximum value 
of 22% in the east-west and north-south directions through the beam center. 
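
The beam figures quoted for expressions (10.8) and (10.9) can be checked numerically. The sketch below uses only the Python standard library and evaluates $J_1$ from its integral representation. The helper names and grid spacing are assumptions chosen for illustration.

```python
import math

def bessel_j1(x, n=200):
    """J_1(x) = (1/pi) * integral_0^pi cos(theta - x sin(theta)) d(theta),
    evaluated with the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.cos(theta - x * math.sin(theta))
    return total * h / math.pi

def jinc(x):
    """2 J_1(x) / x, i.e., expression (10.8) with x = 2 pi l u_max."""
    return 1.0 if x == 0 else 2.0 * bessel_j1(x) / x

def sinc(x):
    """sin(x) / x, one factor of expression (10.9) with x = 2 pi u_max l."""
    return 1.0 if x == 0 else math.sin(x) / x

def half_max_point(f, lo, hi):
    """Bisection for f(x) = 0.5 on [lo, hi], where f falls through 0.5."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_half = half_max_point(jinc, 0.0, 3.0)
fwhm_times_umax = x_half / math.pi      # FWHM in l is 2 * x_half / (2 pi u_max)

xs = [0.01 * i for i in range(1, 1001)]
# Largest response beyond the first null (x = 3.8317 for jinc, pi for sinc).
jinc_sidelobe = max(abs(jinc(x)) for x in xs if x > 3.8318)
sinc_sidelobe = max(abs(sinc(x)) for x in xs if x > math.pi)
```

The values obtained are consistent with the FWHM of $0.705\,u_{\max}^{-1}$ and the 13.2% sidelobe for the circular case, and with the sidelobe of about 22% for the rectangular case.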

With uniform weighting, the strong, near-in sidelobes (close to the main beam) 
in Fig. 10.2 obscure low-level detail and thereby reduce the range of intensity 
levels that can be reliably measured. The near-in sidelobes of the functions in 
expressions (10.8) and (10.9) can be reduced at the expense of some increase in the 
width of the synthesized beam by introducing a Gaussian or similar taper into the 
weighting function. The effect of such tapering of the visibility is shown in Fig. 10.2. 
The taper can be specified in terms of the amplitude of the tapering function at 
a distance $u_{\max}$ from the (u, v) origin; a taper to approximately −13 dB of the central value
is commonly used. With such a taper, the weighting $w(u,v)$ is the product of two
functions: $w_u(u,v)$, the weighting required to obtain uniform effective density, and
$w_t(u,v)$, the tapering function. Thus, the synthesized beam is the Fourier transform
of $W(u,v)\,w_u(u,v)\,w_t(u,v)$:


$$b_0(l,m) = \overline{W}(l,m) \ast\ast\, \overline{w}_u(l,m) \ast\ast\, \overline{w}_t(l,m) , \qquad (10.10)$$


where the bar denotes a Fourier transform. The Fourier transform of $W(u,v)\,w_u(u,v)$
is simply the beam obtained with uniform effective density, for example, as in
expressions (10.8) or (10.9). If $w_t(u,v)$ is a two-dimensional Gaussian function,
its Fourier transform is also a Gaussian. Thus, the sidelobe reduction results from
convolution with a Gaussian in the (l, m) domain. The variances of functions are
additive under convolution [see, e.g., Bracewell (2000)], so the beam obtained by
convolution with $\overline{w}_t$ is broader than that with no tapering, as is evident in Fig. 10.2.


¹This synthesized response should not be confused with the power pattern of a uniformly illuminated
antenna with circular aperture of radius $r$, which is proportional to $[J_1(2\pi r l/\lambda)/(\pi r l/\lambda)]^2$
and has a full width at half-maximum of $0.514\lambda/r$, first null at $0.610\lambda/r$, and first sidelobe of
1.7%. The antenna pattern is proportional to the Fourier transform of the autocorrelation function
of a uniform circular aperture.


Fig. 10.2 Examples of synthesized beam profiles. Curves for no taper correspond to a visibility
distribution that is uniform within (a) a rectangular area of width $2u_{\max}$, and (b) a circular area
of diameter $2u_{\max}$. For no taper, the responses correspond to expression (10.9) for (a) and (10.8)
for (b). The effects of Gaussian tapers that reduce the visibility at the edge of the distribution to
30% and to 10% are also shown. Note the difference in the ordinate scales. [Both panels plot the
normalized response against $l\,u_{\max}$.]

An interesting property of the uniform weighting is that it minimizes the mean- 
squared deviation of the resulting intensity from the true intensity, within the 
constraint that unmeasured visibility values remain zero. This can be understood as 
follows. Since the true intensity distribution $I(l,m)$ and the true visibility function
$V(u,v)$ are a Fourier pair, and the weighted measured visibility and the derived
intensity $I_0(l,m)$ are a Fourier pair, it follows that the differences between these
quantities in the two domains are also a Fourier pair, to which we can apply
Parseval's theorem. Recall that $W(u,v)$ is the transfer function, $w_u(u,v)$ is the
weighting required to obtain effective uniform density of data in the (u, v) plane,
and $w_t(u,v)$ is an applied taper. Thus, we can write


$$\begin{aligned}
&\iint_{\rm meas} \left| V(u,v) - V(u,v)\,W(u,v)\,w_u(u,v)\,w_t(u,v) \right|^2 du\,dv \\
&\qquad + \iint_{\rm unmeas} |V(u,v)|^2 \, du\,dv \\
&\qquad = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |I(l,m) - I_0(l,m)|^2 \, dl\,dm . \qquad (10.11)
\end{aligned}$$


The first and second lines of Eq. (10.11) represent the measured and unmeasured 
areas of the (u, v) plane, respectively. In the measured area, $W(u,v)\,w_u(u,v) = 1$.
For the case of uniform weighting, $w_t = 1$, so the integral on the first line is zero.
This condition minimizes the squared difference between the true and observed
intensity distributions on the third line. If $I(l,m)$ is an unresolved point source,
then $I_0(l,m)$ is equal to the synthesized beam. The uniform weighting minimizes
the squared difference, over $4\pi$ steradians, between the synthesized beam and the
response to a point source as it would be observed with unlimited (u, v) coverage. 
In this sense, it is sometimes said that uniform weighting minimizes the sidelobes 
of the synthesized beam. However, as shown in Fig. 10.2, a Gaussian taper reduces 
the sidelobes outside of the main beam at the expense of widening the beam. Images 
derived from visibility data that are uniformly weighted within the measured area of 
the (u, v) plane have been referred to as the principal solution or principal response 
(Bracewell and Roberts 1954). The related process of reducing the sidelobe response 
in optical imaging is called apodization, for which there is an extensive literature; 
see, for example, Jacquinot and Roizen-Dossier (1964) and Slepian (1965). 


10.2.2.1 Robust Weighting 


With large arrays, the visibility data must be interpolated onto a uniform grid 
as described in Sect. 5.2 in order to make computations tractable. The simplest 
approach is called cell averaging, where each data point is associated with the (u, v) 
grid point nearest to it. The number of points averaged in a cell will decrease with 
increasing (u, v) distance, and many cells will have zero entries. Thus, the variance 
of the visibility estimates will vary considerably over the (u,v) plane. A conflict 
arises between the goal of forming a synthesized beam that is narrow and has low 
sidelobes and achieving the optimum sensitivity for the detection of weak sources. 
The best strategy for detecting a weak point source in the field is to use natural 
weighting, i.e., performing the image transform with variance weighting. On the 
other hand, if the signal-to-noise ratio is high, an image with better resolution and 
lower sidelobes can be obtained with uniform weighting.

Briggs (1995) introduced a logarithmically parametrized scheme that allows a
continuous variation in weighting between uniform and variance weighting. The
process is called robust weighting. The weighting of cell $(i,k)$ in the (u, v) plane
whose visibility has an rms error of $\sigma_{ik}$ is specified as


$$w_{ik} = \frac{1}{S^2 + \sigma_{ik}^2} , \qquad (10.12)$$


where $S$ is a parameter defined by


$$S^2 = \frac{(5 \times 10^{-R})^2}{\overline{w}} . \qquad (10.13)$$


$R$ is the robustness factor, and $\overline{w}$ is the average variance weighting factor over the $n_c$
cells in the image,


$$\overline{w} = \frac{1}{n_c} \sum \frac{1}{\sigma_{ik}^2} . \qquad (10.14)$$

The nominal range of $R$ is −2 to 2. $R = 2$ makes $S^2$ very small with respect to $1/\overline{w}$ so
that the weighting approaches natural weighting, whereas $R = -2$ makes $S^2$ large
with respect to $1/\overline{w}$ so that the weighting approaches the uniform weighting. $R = 0$
produces an rms that is midway between the values for $R = -2$ and 2. $R$ is called
the robustness factor because as it increases, the image is more immune to errors in
calibration or errors due to radio frequency interference, because the effect of a bad 
point in a cell with few data points is deemphasized as R increases. An example of 
how the synthesized beamwidth and rms noise vary with R is shown in Fig. 10.3. In 
the vicinity of R = 0, which is the normal default value, the beamwidth and rms 
noise are most sensitive to changes in R. For the example shown in Fig. 10.3, the 
beamwidth increases by 5%, and the rms noise decreases by 45% as R increases 
from —0.5 to 0.5. For inhomogeneous arrays such as those used in VLBI, the gain in 
sensitivity can increase markedly for little increase in beamwidth. 
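
The behavior of the robust weights at the two ends of the range of $R$ can be sketched as follows, assuming that Eq. (10.13) reads $S^2 = (5 \times 10^{-R})^2/\overline{w}$ and that the sum in Eq. (10.14) runs over the occupied cells. The function name and the rms values are hypothetical.

```python
def robust_weights(sigma, robustness):
    """Robust weights per Eqs. (10.12)-(10.14).

    sigma      -- rms errors sigma_ik of the occupied cells (flat list)
    robustness -- the factor R, nominally between -2 and 2
    """
    n_c = len(sigma)
    w_bar = sum(1.0 / s ** 2 for s in sigma) / n_c          # Eq. (10.14)
    s_sq = (5.0 * 10.0 ** (-robustness)) ** 2 / w_bar       # Eq. (10.13), assumed form
    return [1.0 / (s_sq + s ** 2) for s in sigma]           # Eq. (10.12)

# Hypothetical cell rms values; the second cell has half the rms of the first.
sigma = [1.0, 0.5, 0.25, 2.0]

w_nat = robust_weights(sigma, 2.0)     # close to 1/sigma^2 (natural weighting)
w_uni = robust_weights(sigma, -2.0)    # nearly equal for all cells (uniform)

ratio_natural = w_nat[1] / w_nat[0]    # approaches (1.0 / 0.5)^2 = 4
ratio_uniform = w_uni[1] / w_uni[0]    # approaches 1
```

Intermediate values of $R$ give weight ratios between these two limits, which is the continuous trade-off between sensitivity and beam shape described above.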


10.2.3 Imaging by Discrete Fourier Transformation 


Fig. 10.3 Synthesized beamwidth vs. normalized rms noise level in an image
for robustness factor $R$ ranging from −2 to 2. The calculations are for the source
1987A (Dec. = −69°) observed with two tracks of the Australia Telescope
(configurations 6A and 6C) of about 7-h duration each. Adapted from Briggs (1995).

The speed of the fast algorithm for the discrete Fourier transform (FFT), briefly
discussed in Sect. 5.2, is a major advantage in computing large images. However, the
use of the FFT introduces two complications in addition to those discussed for the
direct transform: (1) the necessity to evaluate the visibility at points on a rectangular
grid and (2) the resulting possibility of aliasing of parts of the image from outside the
synthesized field. The evaluation at the grid points is often referred to as gridding.
The output of such a process can be represented by the following expression:


$$\frac{w(u,v)}{\Delta u\,\Delta v}\;{}^2\mathrm{III}\!\left(\frac{u}{\Delta u},\frac{v}{\Delta v}\right)\left\{ C(u,v) \ast\ast \left[ W(u,v)\,V(u,v) \right] \right\} . \qquad (10.15)$$


Here the visibility V (u, v), measured at the points denoted by the transfer function 
W(u, v), is convolved with a function C(u, v) to produce a continuous visibility 
distribution. This is then resampled at points in a rectangular grid with incremental 
spacings Au and Av. This process is sometimes referred to as convolutional 
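gridding. The resampling is here represented by the two-dimensional shah function
${}^2\mathrm{III}$ (Bracewell 1956b), defined by


$${}^2\mathrm{III}\!\left(\frac{u}{\Delta u},\frac{v}{\Delta v}\right) = \Delta u\,\Delta v \sum_{i=-\infty}^{\infty}\sum_{k=-\infty}^{\infty} {}^2\delta(u - i\Delta u,\; v - k\Delta v) , \qquad (10.16)$$

where ${}^2\delta$ is the two-dimensional delta function. The weighting to optimize the beam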
is applied to the resampled data. Although this process is described mathematically 
in terms of convolution and resampling, in practice the convolution is evaluated 
only at the grid points. The Fourier transform of (10.15) represents the measured 
intensity: 


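$$I_{\rm meas}(l,m) = {}^2\mathrm{III}(l\,\Delta u, m\,\Delta v) \ast\ast\, \overline{w}(l,m) \ast\ast \left\{ \overline{C}(l,m) \left[ \overline{W}(l,m) \ast\ast\, I(l,m) \right] \right\} . \qquad (10.17)$$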


As a result of the Fourier transformation, the intensity function $I(l,m)$ is convolved
with the Fourier transform of the transfer function; multiplied by $\overline{C}(l,m)$, which
is the Fourier transform of the convolving function; and then convolved with the
Fourier transforms of the weighting and resampling functions. This last convolution
causes the whole image to be replicated at intervals $\Delta u^{-1}$ in $l$ and $\Delta v^{-1}$ in $m$.
These intervals are equal to the dimensions of the image in the (l, m) plane; that
is, $\Delta u^{-1} = M\Delta l$ and $\Delta v^{-1} = N\Delta m$, for an $M \times N$ point array. The function
$\overline{C}(l,m)$ takes the form of a taper applied to the image, and if this function does not
vary greatly on the scale of the width of $\overline{w}(l,m)$, which is usually the case for large
images, then $\overline{w}(l,m)$ in Eq. (10.17) can be convolved directly with $\overline{W}(l,m) \ast\ast\, I(l,m)$,
and Eq. (10.17) becomes


$$I_{\rm meas}(l,m) \simeq {}^2\mathrm{III}(l\,\Delta u, m\,\Delta v) \ast\ast \left\{ \overline{C}(l,m) \left[ I(l,m) \ast\ast\, b_0(l,m) \right] \right\} , \qquad (10.18)$$


where the synthesized beam $b_0(l,m)$ enters through the relationship in Eq. (10.6).
Comparison with Eq. (10.5) shows that the effect of the gridding and resampling
is to multiply the image by $\overline{C}(l,m)$ and replicate it. This replication introduces the
aliasing.

Returning to the estimation of the visibility at the grid points, we might perhaps 
expect the best technique to be some form of exact interpolation so that the resulting 
values are equal to those that would be obtained by measurement at the grid points. 
A method of this type has been described by Thompson and Bracewell (1974). 
However, the problem of aliasing remains, and the most effective way to deal with 
this is to convolve the data in the (u,v) plane with the Fourier transform of a 
function that, in the (l, m) plane, varies very little over the image and then falls
off rapidly at the image edges. We therefore look for a convolving function $C(u,v)$
for which the Fourier transform $\overline{C}(l,m)$ has these properties. An ideal function with
infinitely sharp cutoff at the field edges would completely eliminate the aliasing 
since there would be no overlap of the replicated images. Unfortunately, this ideal 
is not practical because the required convolving function is not bounded in the 
(u, v) plane. Nevertheless, a very worthwhile degree of suppression of the aliasing 
is possible with a careful choice of functions. A common and convenient practice 
is to combine both the gridding, and the convolution to minimize aliasing, into a 
single operation. Note, however, that at the (u, v) points at which the measurements 
are made, the function $C(u,v) \ast\ast\, [W(u,v)\,V(u,v)]$, in general, is not equal to
the measured visibility V(u, v). Thus, the gridding process cannot precisely be 
described as interpolation. Also, because of the convolution, the sampled points 
represent averages of the visibility local to the grid points, rather than samples of 
the visibility function. Finally, note also that although convolution is effective in 
suppressing artifacts that result from gridding of the data, it does not reduce sidelobe 
or ringlobe responses to sources located outside the area of the image. 
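
The combined gridding and convolution can be sketched in one dimension. The fragment below evaluates $C(u) \ast [W(u)V(u)]$ only at the grid points, using a Gaussian kernel of the form of Eq. (10.23) with $\alpha = 0.75$. The function names, the kernel support of three cells on either side, and the single test sample are assumptions, and no normalization by the summed kernel weights is applied.

```python
import math

def gaussian_kernel(du, alpha=0.75):
    """Return C_1(u) of Eq. (10.23) as a function of the offset in u."""
    norm = 1.0 / (alpha * du * math.sqrt(math.pi))
    return lambda x: norm * math.exp(-(x / (alpha * du)) ** 2)

def grid_1d(u_samples, vis, du, n_cells, kernel, support=3):
    """Convolve irregular samples with `kernel` and evaluate the result at
    grid points u_k = k * du for k = -n_cells/2 ... n_cells/2 - 1."""
    grid = [0j] * n_cells
    half = n_cells // 2
    for u, v in zip(u_samples, vis):
        k_near = round(u / du)                 # nearest grid index
        for k in range(k_near - support, k_near + support + 1):
            idx = k + half
            if 0 <= idx < n_cells:
                grid[idx] += kernel(k * du - u) * v
    return grid

du = 1.0
kernel = gaussian_kernel(du)
# One hypothetical sample of unit visibility at u = 2.3 grid units.
grid = grid_1d([2.3], [1.0 + 0.0j], du, 16, kernel)
k_peak = max(range(16), key=lambda i: abs(grid[i])) - 8
```

The sample contributes to several neighboring cells, with the largest contribution at the nearest grid point, and the gridded values are local weighted averages rather than interpolated visibilities.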


10.2.4 Convolving Functions and Aliasing 


From the foregoing discussion, we can conclude that the point of principal concern 
in the use of the FFT is the choice of convolving function. A detailed discussion of 
convolving functions is given by Schwab (1984). It is convenient to consider those 
that are separable into one-dimensional functions of the same form for $u$ and $v$, that is,


$$C(u,v) = C_1(u)\,C_1(v) . \qquad (10.19)$$


We therefore discuss some examples of the function $C_1$.


Rectangular Function. This function is the one used in cell averaging discussed in
Sect. 5.2.2. It can be written


$$C_1(u) = (\Delta u)^{-1}\,\Pi\!\left(\frac{u}{\Delta u}\right) , \qquad (10.20)$$


where $\Pi$ is the unit rectangle function defined by


$$\Pi(x) = \begin{cases} 1 , & |x| \le \tfrac{1}{2} \\ 0 , & |x| > \tfrac{1}{2} . \end{cases} \qquad (10.21)$$

The Fourier transform of $C_1(u)$ is

$$\overline{C}_1(l) = \frac{\sin(\pi \Delta u\, l)}{\pi \Delta u\, l} . \qquad (10.22)$$


At the edge of the synthesized field, $l = (2\Delta u)^{-1}$ and $\overline{C}_1(1/2\Delta u) = 2/\pi$. The
image is tapered by a sinc-function profile in the $l$ and $m$ directions and a sinc-squared
profile along the diagonals. Equation (10.22) is plotted in Fig. 10.4, and
the value at the first maximum outside the edge of the image is 0.22 of the value
at the image center. The effect of aliasing is shown more directly in Fig. 10.5a,
which is a plot of $\overline{C}_1(l)/\overline{C}_1[f(l)]$, where $f(l)$ is the value of $l$ within the image [i.e.,
$|f(l)| < (2\Delta u)^{-1}$] at which the alias of a feature at $l$ would appear. This quantity
gives the relative response to an aliased feature in an image that has been corrected
for the taper imposed by $\overline{C}_1(l)$. It is clear that simple averaging of points within a
rectangular cell performs poorly in suppressing aliasing.
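
The aliasing factor $\overline{C}_1(l)/\overline{C}_1[f(l)]$ for cell averaging can be evaluated with a short sketch. Here $l$ is expressed in units of the half-width of the image, so that the image edge is at $|x| = 1$ and the replication interval is 2. The function names are hypothetical.

```python
import math

def taper_rect(x):
    """Eq. (10.22) with x = 2 * du * l, so the image edge is at |x| = 1."""
    arg = 0.5 * math.pi * x
    return 1.0 if arg == 0 else math.sin(arg) / arg

def alias_position(x):
    """f(l): the position in [-1, 1) at which a feature at x appears
    after the image is replicated at intervals of 2."""
    return (x + 1.0) % 2.0 - 1.0

def aliasing_factor(x, taper):
    """Relative response to an aliased feature after the image has been
    corrected for the taper."""
    return abs(taper(x) / taper(alias_position(x)))

edge_value = taper_rect(1.0)                      # 2/pi at the image edge
aliased_to = alias_position(1.5)                  # a feature at 1.5 lands at -0.5
factor = aliasing_factor(1.5, taper_rect)         # suppressed only by a factor of 3
```

A feature halfway between the image edge and the center of the adjacent replication is thus attenuated to only one-third of its amplitude, which illustrates the poor alias rejection of the rectangular function.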


Gaussian Function. Here we have 


$$C_1(u) = \frac{1}{\alpha\,\Delta u\,\sqrt{\pi}}\; e^{-(u/\alpha\Delta u)^2} \qquad (10.23)$$


and

$$\overline{C}_1(l) = e^{-(\pi\alpha\,\Delta u\, l)^2} . \qquad (10.24)$$


The value of the constant $\alpha$ can be chosen to vary the widths of the functions as
desired. If $\alpha$ is too small, $C_1(u)$ will be too narrow, and only visibility measurements
that are close to grid points will be used effectively in the imaging. If $\alpha$ is too
large, the function $\overline{C}_1(l)$ will taper the resulting image too severely. The Gaussian
convolving function was used in the early years of the Westerbork array with
$\alpha = 2\sqrt{\ln 4}/\pi = 0.750$ (Brouw 1971). The value of the factor $e^{-(u/\alpha\Delta u)^2}$ in $C(u)$ is
then equal to 0.41 for a point on a diagonal in the (u, v) plane midway between two
grid points. Thus, all measured points enter into the image with significant weights,
and at the edge of the image, the tapering factor $\overline{C}_1 = \tfrac{1}{4}$. A curve for the Gaussian
function is shown in Fig. 10.4.


Fig. 10.4 Three examples of the tapering function $\overline{C}_1(l)$, which is the Fourier transform of the
convolving function $C_1(u)$: rectangular, Gaussian, and Gaussian-sinc. The ordinate is the magnitude
of $\overline{C}_1(l)$. For the Gaussian convolving function, $\alpha = 0.75$. For the Gaussian-sinc
convolving function, $\alpha_1 = 1.55$, $\alpha_2 = 2.52$, and beyond the fourth subsidiary maximum,
only the envelope of the maxima is shown. On the abscissa scale, the center of the image is at zero
and the edge at 1.0. The data for the Gaussian-sinc function were computed by F. R. Schwab.
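
The numerical values quoted for the Westerbork Gaussian can be reproduced with a few lines, assuming the Gaussian forms $C_1(u) \propto e^{-(u/\alpha\Delta u)^2}$ and $\overline{C}_1(l) = e^{-(\pi\alpha\Delta u\, l)^2}$ and taking $u$ in the exponential as the distance from the grid point. The variable names are illustrative.

```python
import math

alpha = 2.0 * math.sqrt(math.log(4.0)) / math.pi   # about 0.750
du = 1.0                                           # grid spacing (arbitrary units)

def taper_gauss(l):
    """Image-plane taper for the Gaussian convolving function."""
    return math.exp(-(math.pi * alpha * du * l) ** 2)

# Taper at the edge of the image, l = 1 / (2 du).
edge_taper = taper_gauss(0.5 / du)

# Distance from a grid point to the midpoint of the diagonal joining it
# to its diagonal neighbor, and the exponential factor at that distance.
r_diag = du / math.sqrt(2.0)
diag_factor = math.exp(-(r_diag / (alpha * du)) ** 2)
```

With this choice of $\alpha$, the edge taper is exactly 1/4 because $(\pi\alpha/2)^2 = \ln 4$, and the factor at the diagonal midpoint is about 0.41.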


Gaussian-Sinc Function. The ideal form for the image tapering function $\overline{C}_1(l)$
would be a rectangle, which corresponds to convolution with a sinc function, as
in Eq. (10.22). However, the envelope of a sinc function falls to zero slowly as
its argument increases, and the computation required for the convolution becomes
large. Truncation of the sinc function is undesirable because in the $l$ domain,
the desired rectangular function is convolved with the Fourier transform of the
truncation function, and this destroys the sharp cutoff at the edges of the image.
A better procedure is to multiply the sinc function with a Gaussian, which gives


$$C_1(u) = \frac{\sin(\pi u/\alpha_1 \Delta u)}{\pi u}\; e^{-(u/\alpha_2 \Delta u)^2} \qquad (10.25)$$


and

$$\overline{C}_1(l) = \Pi(\alpha_1 \Delta u\, l) \ast \left[ \sqrt{\pi}\,\alpha_2\,\Delta u\; e^{-(\pi\alpha_2 \Delta u\, l)^2} \right] . \qquad (10.26)$$


Good performance is obtained with $\alpha_1 = 1.55$ and $\alpha_2 = 2.52$, with the convolution
extending over an area about $6\Delta u$ in width. Corresponding curves for $\overline{C}_1(l)$ and
the resulting aliasing are given in Figs. 10.4 and 10.5b. This convolving function is
much better than either of the two previous examples.


Fig. 10.5 Logarithmic plot of the factor by which the amplitudes of structures outside the image
are multiplied when aliased into the image. On the abscissa scale, 1.0 is the edge of the image
and 2, 4, 6, ..., are the centers of the adjacent replications. (a) Aliasing factor for a rectangular
convolving function of width equal to $\Delta u$ (cell averaging). (b) Aliasing factor for a Gaussian-sinc
convolving function with the optimized parameters given in the text. The broken line indicates the
envelope of the maxima. Data computed by F. R. Schwab.


Spheroidal Functions. Various other functions can be found that have the features
desirable for convolution. As a measure of the effectiveness of the suppression of
aliasing, Brouw (1975) has suggested the following quantity:


$$\frac{\displaystyle\iint_{\rm image} |\overline{C}(l,m)|^2 \, dl\,dm}{\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |\overline{C}(l,m)|^2 \, dl\,dm} , \qquad (10.27)$$


which shows the fraction of the integrated squared amplitude of the tapering 
function that falls within the image. Maximization of (10.27) provides a criterion 
for choosing a convolving function. This approach led to consideration of the 
prolate spheroidal wave functions [see, e.g., Slepian and Pollak (1961)] and the 
spheroidal functions (Rhodes 1970). Schwab (1984) found that among functions 
investigated, the latter provide the best approach to an optimum convolving function. 
The spheroidal functions are solutions to certain differential equations and are not 
expressible in simple analytic form. In applying such functions for convolution of 
visibility data, they are computed in advance to provide a look-up table. Comparison 
of some functions of this type with the Gaussian-sinc function shows that the
aliasing factor $\overline{C}_1(l)/\overline{C}_1[f(l)]$ falls off about as rapidly from the center to the edge
of the image, but as $l$ increases beyond the edge of the image, it reaches values
an order of magnitude or more lower than those for the Gaussian-sinc function
(Briggs et al. 1999). Computational capacity complicates the choice of the optimal
function, since it limits the area of the (u, v) plane over which the convolution can 
be performed. Commonly, this area is six to eight grid cells wide and centered on the 
point to be interpolated. Roundoff errors in the Fourier transform are amplified in 
the removal of the tapering function and may limit the allowable taper at the edges 
of the image. 


10.2.5 Aliasing and the Signal-to-Noise Ratio 


Features aliased into an image from outside the boundary include not only the 
images of features on the sky but also the random variations resulting from the 
system noise. If we consider a direct Fourier transform of the noise component 
of the measured visibility, it is clear from Eq. (10.7) that for any point (l, m), the 
visibility data are weighted by complex exponential factors, all of which have the 
same modulus. Since the noise is independent at each data point in the (u, v) plane, 
the variance of the noise in the $(l, m)$ plane is statistically constant in all parts of
the image. If the FFT is used, however, the rms noise level across the image is
multiplied by the function $\overline{C}(l, m)$, and details beyond the image edge are aliased
into the image. Note that the noise contributions combine additively in the variance.
Thus, in one dimension, the noise variance as a function of $l$ is proportional to


$$\mathrm{III}(l\,\Delta u) * |\overline{C}_1(l)|^2. \qquad (10.28)$$

Fig. 10.6 Effect of aliasing on the variance of the noise across an image. The abscissa in each
case is $l$ in units of half the image width; the image center is at 0, the edge at 1.0, and the center of
the adjacent replication at 2.0. (a) Solid curve shows the taper for a Gaussian convolving function
$\overline{C}_1$, and dashed curves show the effect of aliasing. (b) Variance of the noise including aliased
component after correction for taper $\overline{C}_1$. Adapted from Napier and Crane (1982) [see also Crane
and Napier (1989)].


The replication resulting from the FFT can also be written in terms of a summation,
and the variance of the noise at a point $l$ within the image is then proportional to


$$\sum_{i=-\infty}^{\infty} \left|\overline{C}_1\!\left(l + \frac{i}{\Delta u}\right)\right|^2. \qquad (10.29)$$


Usually $\overline{C}_1(l)$ decreases sufficiently with $l$ that only the noise from the adjacent
replication of the image makes a serious contribution through aliasing. This
contribution is greatest near the edge of the image, as shown in Fig. 10.6.

If the convolving function is the Gaussian-sinc type, we see from Fig. 10.5b
that, except for values of $2\Delta u\, l$ between 1.0 and 1.1, aliased features are reduced
in amplitude by a factor $< 10^{-2}$, and in the square of the amplitude by $< 10^{-4}$.
Thus, there is no significant increase in the noise level as a result of aliasing, except
in a narrow zone at the edge of the image.

At the other extreme, the aliasing is most serious in the case of cell averaging,
for which $\overline{C}_1(l)$ is the sinc function given by Eq. (10.22). Expression (10.29) then
becomes

$$\sum_{i=-\infty}^{\infty} \frac{\sin^2[\pi(\Delta u\, l + i)]}{\pi^2(\Delta u\, l + i)^2} = 1, \qquad (10.30)$$

which indicates that the aliasing exactly cancels the taper, and the variance of the
noise is constant with $l$, that is, before any correction for tapering of the astronomical
features in the image is applied. (This result could also be deduced from the fact
that in cell averaging, each visibility measurement contributes to one grid point
only, and the noise components of the visibility at the grid points are therefore
independent.) However, the intensity distribution of the sky within the field being
imaged is tapered by the function $\overline{C}_1(l)$, and correction for this taper then causes
the noise to increase toward the edges of the image. For the sinc-function taper, the
noise is increased by a factor of $\pi/2$ at the edge of the image on the $l$ and $m$ axes
and by $(\pi/2)^2$ at the corners. At the center of the image, the aliased contribution
originates at points for which $2\Delta u\, l$ is an even integer in the plots in Fig. 10.5, and
in both cases shown, the aliasing factor $\overline{C}_1(l)/\overline{C}_1[f(l)]$ drops to a very low value.
With any of the convolving functions that we have considered, there is no significant
increase in the noise at the center of the image, and the signal-to-noise ratio for a
source at that point is determined by the factors discussed in Sect. 6.2.
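
The cell-averaging result can be checked numerically. The sketch below, in which the function names and the truncation of the infinite sum are our own choices, evaluates the sum over replications of the squared sinc taper at a position $x = \Delta u\, l$ and the factor by which the taper correction raises the noise at the image edge ($x = 0.5$).

```python
import math

def sinc(x):
    """sin(pi x) / (pi x), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def aliased_noise_variance(x, n_terms=20000):
    """Truncated form of the replication sum for the sinc taper; x = du * l."""
    return sum(sinc(x + i) ** 2 for i in range(-n_terms, n_terms + 1))

def edge_noise_factor():
    """Noise increase at the image edge after dividing by the sinc taper."""
    return 1.0 / sinc(0.5)
```

For any position across the image, the truncated sum should come out close to 1, and the edge factor should reproduce $\pi/2$; at a corner the factor applies in both coordinates, giving $(\pi/2)^2$.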


10.2.6 Wide-Field Imaging 


To take full advantage of large new instruments with wide bandwidths, high 
sensitivity, and full polarization responses, it is necessary to measure the radio sky 
down to the level of the background radiation from the Epoch of Reionization (EoR) 
and to be able to separate out components from individual radio sources that overlie 
the background. The width of the synthesized field may be much greater than a few 
degrees, so the image is no longer the Fourier transform of the visibility function. 
The basic requirement for such an analysis is an equation for the visibility values 
that would be measured for a given brightness distribution, taking account of all 
details of the locations and characteristics of the individual antennas, the path of 
the incoming radiation through the Earth’s atmosphere including the ionosphere, 
the atmospheric transmission, etc. This is the interferometer measurement equation 
introduced in Sect. 4.8. In its basic form, it describes the response of a single 
pair of antennas and is thus applicable to any specified system of antennas and 
any brightness distribution, to provide values of the visibility for each antenna 
pair. It includes direction-dependent effects such as the primary beam patterns of 
the antennas, polarization effects that vary with the alignment of the polarization 
of the source relative to that of the antennas, and the baselines of the antenna 
pairs. These must be accounted for without small-field or other approximations. 
Direction-independent effects such as large-scale propagation in the atmosphere and 
the ionosphere, and the response of the receiving system, can also be included. 

The reverse operation, i.e., the calculation of the optimum estimate of the image 
from the measured visibility values, is less simple. Taking the Fourier transform 
of the observed visibility function usually produces a brightness function with 
physically distorted features such as negative brightness values in some places. 
However, starting with a simple but physically realistic model for the brightness, 
the measurement equation can accurately provide the corresponding visibility values 
that would be observed. By comparing these with the observed values, it is possible 
to adjust the brightness model toward the observed distribution and, by iterative 
repetition of this process, to arrive at an image that agrees with the visibility 
measurements to within the uncertainties resulting from the noise. An example of
this process of making an image of a radio source is described by Rau et al. (2009),
who use an iterative Newton-Raphson approach, as follows.


1. Calibrate the interferometer responses by making observations of sources with 
known position and structure. This includes measurement of both parallel and 
cross polarizations (for circular or linear polarization, whichever is used). 

2. Make observations of the area of sky under investigation and, using the calibra- 
tion data from (1), determine the (complex) visibility function for points in a 
rectangular grid in the (u, v) plane. 

3. Using the measurement equation, calculate visibility values for a model source 
centered in the area in (2), for the (u,v) values of the gridded visibility 
measurements in (2). The model can make use of any prior information on 
the source under observation, but otherwise a point source model will generally 
suffice. 

4. Subtract the calculated visibilities for the model source from the corresponding 
observed values in (2), and take the Fourier transform of the difference to provide 
a brightness function that represents the difference between the sky and the 
model. 

5. Use the brightness function from (4) to improve the model brightness function,
i.e., to bring its visibilities closer to those measured in (2). To do this, add a fraction
$\gamma$ of the brightness function from (4) to the model, to provide a new model
source. $\gamma$ is the loop gain in the process.

6. Calculate the visibility values ($V_{mi}$) for the improved source model from (5),
and if they are sufficiently close to the observed visibilities ($V_{oi}$), go to (7).
Otherwise, return to (4) with the improved model from (5). Comparison of the
observed and model visibilities involves computation of
$\chi^2 = \sum_i [(V_{oi} - V_{mi})(V_{oi} - V_{mi})^*]$, which is minimized by the iterative process.

7. Take the residual differences between the observed and model visibility values 
in (6), Fourier transform them to brightness, and add them to the model values 
from (6). This step ensures that the Fourier transform of the final model is equal 
to the observed visibilities. 


The number of iterations (from step 6 back to step 4) required varies inversely
with the value of $\gamma$ in step 5. A value of $\gamma = 0.5$ or less allows the optimum solution
to be approached more accurately by using smaller steps. The choice of the model
source in step 3 is not critical. For example, if the source is actually a wide one and
a point source is used as the model, then in step 3, the model visibility values will
have significant values over a much wider range of $(u, v)$ spacings than that of the
measured visibilities. However, in step 4, the fraction $\gamma$ of the excess visibilities is
subtracted, and the model sequentially moves toward the measured visibility, within
the limits of the noise. Obtaining an image that is a realistic model of the sky, and is
in agreement with the measured visibility, is the essential goal in synthesis imaging.
This iterative procedure with $\chi^2$ minimization illustrates the basic approach to a
number of the processes used in imaging.
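
The loop in steps 3 to 7 can be illustrated with a one-dimensional toy problem. In the sketch below, a plain discrete Fourier transform at a sparse set of sampled spacings stands in for the measurement equation, the starting model is an empty sky, and all names (`dft`, `inverse_dft`, `iterate_model`) are hypothetical. It is not the algorithm of Rau et al. (2009), only an illustration of the residual-and-update cycle with loop gain $\gamma$.

```python
import cmath

def dft(image, sampled):
    """Toy 'measurement equation': visibilities of a 1-D image at sampled u."""
    n = len(image)
    return {u: sum(image[l] * cmath.exp(-2j * cmath.pi * u * l / n)
                   for l in range(n)) for u in sampled}

def inverse_dft(vis, n):
    """Transform sparsely sampled visibilities back to brightness."""
    return [sum(v * cmath.exp(2j * cmath.pi * u * l / n)
                for u, v in vis.items()).real / n for l in range(n)]

def chi_squared(obs, mod):
    return sum(abs(obs[u] - mod[u]) ** 2 for u in obs)

def iterate_model(obs, n, gain=0.5, tol=1e-10, max_iter=200):
    model = [0.0] * n                        # step 3: starting model
    history = []
    for _ in range(max_iter):
        mod_vis = dft(model, obs.keys())     # steps 3 and 6: model visibilities
        chi2 = chi_squared(obs, mod_vis)
        history.append(chi2)
        if chi2 < tol:
            break
        resid = {u: obs[u] - mod_vis[u] for u in obs}             # step 4
        resid_img = inverse_dft(resid, n)
        model = [m + gain * r for m, r in zip(model, resid_img)]  # step 5
    # step 7: add the transform of the remaining residuals to the model
    mod_vis = dft(model, obs.keys())
    final_resid = inverse_dft({u: obs[u] - mod_vis[u] for u in obs}, n)
    return [m + r for m, r in zip(model, final_resid)], history
```

In this toy case the sampled spacings must include each $u$ together with $-u$ (modulo the grid size) so that the residual image is real. Each pass then reduces the residual visibilities by the factor $(1 - \gamma)$, so a smaller loop gain needs more iterations, in line with the remark above.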


10.3 Closure Relationships


Closure effects are relationships between visibility values for baselines that form
a closed figure, for example, a triangle or quadrilateral with the antennas at the
vertices. As shown by Eqs. (7.37) and (7.38), the correlator output for antenna pair
$(m, n)$ can be written as


$$r_{mn} = G_{mn} V_{mn} = g_m g_n^* V_{mn}, \qquad (10.31)$$


where $G_{mn}$ is the complex gain for the antenna pair, and $g_m$ and $g_n$ are gain factors for
the individual antennas. We ignore any gain terms that do not factor into the terms
for individual antennas (see Sect. 7.3.3), i.e., those that are baseline dependent.

Considering first the phase relationships, we represent the arguments of the
exponential terms of $r_{mn}$, $g_m$, $g_n$, and $V_{mn}$ by $\phi_{mn}$, $\phi_m$, $\phi_n$, and $\phi_{vmn}$, respectively.
Thus, we can write


$$\phi_{mn} = \phi_m - \phi_n + \phi_{vmn}. \qquad (10.32)$$


For three antennas $m$, $n$, and $p$, the phase closure relationship is


$$\begin{aligned}
\phi_{cmnp} &= \phi_{mn} + \phi_{np} + \phi_{pm} \\
&= \phi_m - \phi_n + \phi_{vmn} + \phi_n - \phi_p + \phi_{vnp} + \phi_p - \phi_m + \phi_{vpm}
\end{aligned} \qquad (10.33)$$

or

$$\phi_{cmnp} = \phi_{vmn} + \phi_{vnp} + \phi_{vpm}. \qquad (10.34)$$


The antenna gain terms, $g_m$ and so on, contain the effects of the atmospheric paths
to the antennas as well as instrumental effects, and since these terms do not appear
in Eq. (10.34), it is evident that the combination of the three correlator output phases
constitutes an observable quantity that depends only on the phase of the visibility.
This property of the phase closure relationships was first recognized and used by
Jennison (1958).
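
A short numerical illustration may help here. In the sketch below (function names are ours), each correlator output is formed as the true visibility multiplied by antenna-based complex gains, and the closure phase is taken as the argument of the triple product around the triangle, which is equivalent to summing the three phases modulo $2\pi$.

```python
import cmath

def corrupted_visibility(v_true, g_m, g_n):
    """Correlator output r_mn = g_m g_n^* V_mn for antenna-based gains."""
    return g_m * g_n.conjugate() * v_true

def closure_phase(r_mn, r_np, r_pm):
    """Closure phase around the triangle m-n-p, in radians within (-pi, pi]."""
    return cmath.phase(r_mn * r_np * r_pm)
```

Because the gains enter the triple product only as $|g_m|^2 |g_n|^2 |g_p|^2$, which is real and positive, the closure phase of the corrupted outputs equals that of the true visibilities whatever the antenna phases are.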

If a point source is observed, then the visibility phases are all zero, and, in the
absence of receiver noise, the closure phase is also zero. Note that if the rms phase
noise on each baseline is $\sigma$, the rms noise in the closure phase is $\sqrt{3}\,\sigma$.

To help visualize the phase closure concept, consider three stations of an
array observing a point source, as shown in Fig. 10.7. We depict the origin of
the instrumental phase terms associated with each station as being caused by
atmospheric delay along each line of sight. The total visibility phase on each
baseline is $\phi_{ij} = (2\pi/\lambda)\,\mathbf{D}_{ij} \cdot \mathbf{s} + \phi_i - \phi_j$; the antenna-based
terms cancel around the triangle as in Eq. (10.33), and hence the closure phase is


$$\phi_{cmnp} = \frac{2\pi}{\lambda}\,(\mathbf{D}_{mn} + \mathbf{D}_{np} + \mathbf{D}_{pm}) \cdot \mathbf{s} = 0 \qquad (10.35)$$


because the sum of the baselines around a triangle is identically zero. This shows
that the closure phase for a point source is zero, even if it is not at the phase-tracking
center or if the station coordinates have errors. A corollary of this result is that the
position of a source cannot be deduced from closure phase measurements alone.

Fig. 10.7 A three-baseline triangle for antennas $m$, $n$, and $p$. $\mathbf{s}$ is the unit vector in the
direction of the source. The phases of the antenna-based gain factors are represented
by atmospheric cloudlets that cause excess phase shifts of $\phi_m$, $\phi_n$, and $\phi_p$, respectively.

If we have $n_a$ antennas and we measure the correlation of all pairs, the number
of independent phase closure relationships is equal to the number of correlator
output phases less the number of unknown instrumental phases, one of which can be
arbitrarily chosen. If there are no redundant spacings, then each closure relationship
provides different information on the source structure. The number of phase closure
relationships is


$$\frac{1}{2} n_a (n_a - 1) - (n_a - 1) = \frac{1}{2}(n_a - 1)(n_a - 2). \qquad (10.36)$$


It is often important to be able to identify which set of closure triangles can
be considered to be independent. This is necessary if closure phases are to be
used directly in model fits. Combinatorial mathematics is useful in this regard. The
question of how many triangles can be formed among $n_a$ antennas can be rephrased
as: Among $n_a$ objects, how many unique ways can three of them be chosen without
replacement or regard to order? The answer is the binomial coefficient


$$n_{pt} = \binom{n_a}{3} = \frac{n_a!}{(n_a - 3)!\,3!} = \frac{n_a (n_a - 1)(n_a - 2)}{6}. \qquad (10.37)$$

Fig. 10.8 The four closure triangles among four antennas. The three triangles
involving the reference antenna, denoted by 1, are independent. The phase
closure on the fourth triangle linking antennas $m$, $n$, and $p$
can be derived from the three independent phase closures.


Similarly, the number of baselines, $n_b$, is


$$n_b = \binom{n_a}{2} = \frac{n_a (n_a - 1)}{2}. \qquad (10.38)$$


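A set of independent triangles can be found by the following process. Select one
antenna as a reference, as shown in Fig. 10.8. The set of independent triangles is all
of those that include the reference antenna. The nonindependent triangles are the
ones that do not involve the reference antenna, taken to be antenna 1, i.e.,


$$\phi_{cmnp} = \phi_{mn} + \phi_{np} + \phi_{pm}, \qquad (10.39)$$


where none of $m$, $n$, and $p$ is equal to one. The sum of closure phases


$$\begin{aligned}
\phi_{c1nm} &= \phi_{1n} + \phi_{nm} + \phi_{m1} \\
\phi_{c1mp} &= \phi_{1m} + \phi_{mp} + \phi_{p1} \\
\phi_{c1pn} &= \phi_{1p} + \phi_{pn} + \phi_{n1}
\end{aligned} \qquad (10.40)$$


is

$$\phi_{nm} + \phi_{mp} + \phi_{pn}, \qquad (10.41)$$


since $\phi_{1n} = -\phi_{n1}$, $\phi_{1m} = -\phi_{m1}$, and $\phi_{1p} = -\phi_{p1}$. Expression (10.41) is the
closure phase of the triangle $mnp$ traversed in the opposite sense, i.e., $-\phi_{cmnp}$, so it
carries no new information. The number of independent
closure triangles is thus given by


$$n_{p\,\mathrm{indep}} = \binom{n_a - 1}{2} = \frac{(n_a - 1)(n_a - 2)}{2}, \qquad (10.42)$$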

Table 10.1 Baselines and phase closures for an array of n_a elements (a)

    n_a      n_b       n_pt    n_p indep    f_p (b)

      2        1          0          0      0
      3        3          1          1      0.33
      4        6          4          3      0.50
      5       10         10          6      0.60
      8       28         56         21      0.75
     10       45        120         36      0.80
     27      351      2,925        325      0.93
     50    1,225     19,600      1,176      0.96
    100    4,950    161,700      4,851      0.98

(a) $n_b = n_a(n_a - 1)/2$. $n_{pt} = n_a(n_a - 1)(n_a - 2)/6$.
$n_{p\,\mathrm{indep}} = (n_a - 1)(n_a - 2)/2$. $f_p = n_{p\,\mathrm{indep}}/n_b = 1 - 2/n_a$.
(b) See Fig. 11.4.


in agreement with Eq. (10.36). The fraction of the phase information that can be
recovered from phase closures in an array is


$$f_p = \frac{n_{p\,\mathrm{indep}}}{n_b} = \frac{(n_a - 1)(n_a - 2)/2}{n_a(n_a - 1)/2} = 1 - \frac{2}{n_a}. \qquad (10.43)$$


Representative numbers are given in Table 10.1. 
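
The entries of Table 10.1 follow directly from the binomial coefficients above. The following sketch (the function name is ours) reproduces one row of the table for a given number of antennas.

```python
from math import comb

def closure_counts(n_a):
    """Return (n_b, n_pt, n_p_indep, f_p) for an array of n_a antennas."""
    n_b = comb(n_a, 2)            # number of baselines
    n_pt = comb(n_a, 3)           # total number of closure triangles
    n_p_indep = comb(n_a - 1, 2)  # triangles that include the reference antenna
    f_p = n_p_indep / n_b         # fraction of phase information recovered
    return n_b, n_pt, n_p_indep, f_p
```

For example, `closure_counts(27)` gives 351 baselines, 2,925 triangles, and 325 independent closure phases, so about 93% of the phase information is available from closure.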
We now discuss the amplitude closure relations. An amplitude closure relationship
involves four antenna pairs, for which four antennas $m$, $n$, $p$, and $q$ are required:


$$\frac{|r_{mn}|\,|r_{pq}|}{|r_{mp}|\,|r_{nq}|} = \frac{|V_{mn}|\,|V_{pq}|}{|V_{mp}|\,|V_{nq}|}. \qquad (10.44)$$

The proof of Eq. (10.44) is obtained by substituting terms of the form $g_m g_n^* V_{mn}$
into the left side of Eq. (10.44), using Eq. (10.31). The moduli of the $g$ terms
then cancel out because the numerator and denominator both contain the product
of the moduli of all four $g$ terms. Six closure amplitudes can be formed from four
antennas; three are reciprocals of the other three and can be ignored. The basic
three configurations are shown in Fig. 10.9. The product of these three closure
amplitudes, $|r_{mn}||r_{pq}|/|r_{mp}||r_{nq}|$, $|r_{mp}||r_{nq}|/|r_{mq}||r_{np}|$, and $|r_{mq}||r_{np}|/|r_{mn}||r_{pq}|$, is
unity, so only two of them are independent. The number of independent amplitude
closure relationships for $n_a$ antennas with no redundant baselines is equal to the
number of measured amplitudes, $\frac{1}{2} n_a(n_a - 1)$, less the number of unknown antenna
gain factors $n_a$, that is,


$$n_{a\,\mathrm{indep}} = \frac{1}{2} n_a(n_a - 1) - n_a = \frac{1}{2} n_a(n_a - 3). \qquad (10.45)$$

Fig. 10.9 The three closure amplitudes that can be formed among four antennas [see Eq. (10.44)].
(We have not included the trivially redundant reciprocal cases, i.e., solid and dashed lines
interchanged.) In each case, the two visibility moduli that go in the numerator of the closure
amplitude are shown by the solid lines, and the two that go in the denominator are shown by
the dashed lines. The product of the three closure amplitudes is unity, so only two of the closure
amplitudes are independent.


The fraction of amplitude information that can be recovered from amplitude closures
is


$$f_a = \frac{n_a - 3}{n_a - 1}. \qquad (10.46)$$

For early usage of the principle of taking ratios of observed visibility amplitudes
to eliminate instrumental gains, see Smith (1952) and Twiss et al. (1960). The total
number of closure quadrangles is


$$n_{at} = 6\binom{n_a}{4}, \qquad (10.47)$$


which is on the order of $n_a^4$. Systematic procedures can be devised to select an
independent set. For a detailed analysis of amplitude closure structures, see Lannes
(1991).
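
The cancellation of the antenna gains in the closure amplitude can be verified in a few lines. This is a minimal sketch with names of our own choosing; the correlator outputs are taken to have the form $r_{mn} = g_m g_n^* V_{mn}$.

```python
def closure_amplitude(r_mn, r_pq, r_mp, r_nq):
    """Closure amplitude |r_mn||r_pq| / (|r_mp||r_nq|) for antennas m, n, p, q."""
    return abs(r_mn) * abs(r_pq) / (abs(r_mp) * abs(r_nq))

def independent_closure_amplitudes(n_a):
    """Measured amplitudes minus unknown antenna gain moduli: n_a (n_a - 3) / 2."""
    return n_a * (n_a - 3) // 2

def amplitude_fraction(n_a):
    """Fraction of amplitude information recoverable from closure amplitudes."""
    return (n_a - 3) / (n_a - 1)
```

Each antenna gain modulus appears once in the numerator and once in the denominator, so the value computed from the correlator outputs equals the value computed from the true visibilities.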

Note that a fundamental requirement for the validity of the closure relationships 
is that at any instant, it must be possible to represent the effect of any signal 
path from the source to the correlator by a single complex gain factor. Thus, the 
effects of the atmosphere must be constant over the source under observation, that 
is, the angular width of the source should be no greater than the isoplanatic patch 
size for the atmosphere. The isoplanatic patch is the area of sky within which the 
path length for an incident wave remains constant to within a small fraction of 
a wavelength; see also Sect. 11.8.4. The size of the isoplanatic patch varies with 
frequency. At a few hundred megahertz or less, it is common to have more than one 
source within an antenna beam, and these may be separated sufficiently in angle 
that ionospheric conditions may be different for each one. The closure conditions 
will then be different for each source, and use of the closure principle then becomes 
more complicated than in the single-source case discussed above. 

The closure relationships have proved to be very important in synthesis imaging.
When applied to unresolved point sources, the phase closure should be zero and the 
amplitude closure unity. Thus, they are useful in checking the accuracy of calibration 
and examining instrumental effects. For resolved sources, they can be used as 
observables in situations in which direct calibration by observation of a calibration 
source is not practicable, as is sometimes the case in VLBI. Most importantly, they 
can be used to improve calibration accuracy for observations where high dynamic 
range is required. The amplitude closure relationships are less frequently used 
because it is generally easier to calibrate the visibility amplitudes than the phases. 
However, they provide a useful check in cases in which the amplitude is required 
with especially high accuracy [for examples, see Trotter et al. (1998), Bower et al.
(2014), and Ortiz-León et al. (2016)].


10.4 Visibility Model Fitting 


The fitting of simple intensity models to visibility data was practiced extensively in 
early radio interferometry, especially when the visibility phase was poorly calibrated 
or the data were not sufficiently complete to allow Fourier transformation. Examples 
of simple models are shown in Figs. 1.5, 1.10, and the Gaussian components in 
Fig. 1.14. 

Model fitting continues to be the only recourse for data interpretation in sparse 
VLBI arrays such as those used at short millimeter wavelengths [see, e.g., Doeleman 
et al. (2008)]. However, model fitting is very important even in large, well-sampled 
arrays that can generate high-quality images. These images are produced by a 
complex process that includes Fourier transformation of visibility data that have 
been interpolated onto a grid, followed by self-calibration and application of 
nonlinear deconvolution algorithms such as CLEAN, as described in Chap. 11. The 
noise in these images is correlated among pixels and can have poorly understood 
characteristics. Such images are not unique and can be considered to be models 
of the true brightness distributions. Extractions of source parameters in the image 
plane can therefore be characterized as “modeling the model.” 

In contrast, the fundamental data product of an array, the visibilities, has well- 
characterized noise properties, i.e., it is uncorrelated Gaussian noise with known 
variance. If the characteristics of the source emission structure are to be interpreted 
with a specific model in mind, the parameters of such a model can often best 
be obtained from direct analysis of the visibility data. Important examples of the 
application of model fitting include the cases of sources whose intensity decreases 
as a power law, as described in Sect. 10.4.4. In these cases, the proper estimate of 
the total flux density and other parameters from image plane analysis is difficult. 
Another application of visibility model fitting is in the determination of the changes 
in parameters of a source in which time-separated observations may not have 
identical (u, v) coverage. Fitting the same model to both data sets, but allowing 
the parameters of interest to vary, is likely to give the best evidence of change. An 
interesting example is provided by Masson (1986) in a measurement of angular
expansion of a compact planetary nebula. From several data sets obtained at 
different epochs, the image from the one with the best (u, v) coverage was used 
as a model to fit to the others, thereby avoiding direct comparison of images made 
with different synthesized beams. 

A useful discussion of the general principles of model fitting can be found in 
Pearson (1999). For the estimate of large numbers of parameters in a Bayesian 
framework, see Lochner et al. (2015). Searching for transient sources directly in
the $(u, v)$ data also has advantages (Trott et al. 2011).


10.4.1 Basic Considerations for Simple Models 


We consider the case of the small field of view ($l, m \ll 1$, $A(l, m) \simeq 1$), where the
transform between image intensity and visibility given in Eqs. (3.7) and (3.10) can
be written as


$$V(u, v) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} I(l, m)\, e^{-j2\pi(ul + vm)}\, dl\, dm \qquad (10.48)$$

$$I(l, m) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} V(u, v)\, e^{j2\pi(ul + vm)}\, du\, dv. \qquad (10.49)$$


A simple common source model is a Gaussian intensity distribution centered at
position $(l_1, m_1)$ with peak intensity $I_0$ and width parameter $a$:


$$I(l, m) = I_0 \exp\left[-\frac{(l - l_1)^2 + (m - m_1)^2}{2a^2}\right], \qquad (10.50)$$


which has FWHM, $\theta_G$, of $\sqrt{8 \ln 2}\,a$. The corresponding model visibility distribution
is


$$V_m(u, v) = S_0\, e^{-2\pi^2 a^2 (u^2 + v^2) - j2\pi(u l_1 + v m_1)}, \qquad (10.51)$$


where $S_0 = 2\pi I_0 a^2$, the total flux density. The visibility has real and imaginary
components that are sinusoidal corrugations, the ridges of which are normal to the
radius vector to the point $(l_1, m_1)$ in the image domain. These visibility components
are modulated in amplitude by a Gaussian function centered on the $(u, v)$ origin
and of width inversely proportional to $a$. Examination of the visibility distribution
can thus indicate the form and position of the main intensity components. For early
discussions and examples of this type of model fitting, see, for example, Maltby and
Moffet (1962), Fomalont (1968), and Fomalont and Wright (1974). Fitting the four
parameters $(I_0, a, l_1, m_1)$ in the image plane or $(S_0, a, l_1, m_1)$ in the visibility plane
is a nonlinear process. It requires an initial guess for the parameters. The choice
of these initial parameters is more obvious in the image plane than in the visibility
plane, but final analysis is best done in the visibility plane.
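
As an illustration of Eq. (10.51), the sketch below evaluates the model visibility of a circular Gaussian source. The function names are ours, and the angular quantities ($a$, $l_1$, $m_1$) are assumed to be in radians with $u$ and $v$ in wavelengths.

```python
import cmath
import math

def gaussian_visibility(u, v, s0, a, l1, m1):
    """Model visibility of a circular Gaussian source, Eq. (10.51)."""
    amplitude = s0 * math.exp(-2.0 * math.pi ** 2 * a ** 2 * (u * u + v * v))
    return amplitude * cmath.exp(-2j * math.pi * (u * l1 + v * m1))

def gaussian_fwhm(a):
    """FWHM of the Gaussian intensity distribution with width parameter a."""
    return math.sqrt(8.0 * math.log(2.0)) * a
```

The amplitude falls to half of $S_0$ at the radius $q_{1/2} = \sqrt{\ln 2 / 2}/(\pi a)$ in the $(u, v)$ plane, while the phase is a linear ramp whose slope gives the position offset; this is the sense in which the visibility distribution indicates both the form and the position of the source.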

To fit model parameters, it is necessary to choose a criterion for the goodness
of fit. Since the real and imaginary components of the visibility usually have
Gaussian noise, the optimum criterion from a maximum likelihood point of view
(see Appendix 12.1) is the $\chi^2$ criterion, which minimizes the weighted
mean-squared difference between the model and the data set of $n_d$ points, i.e.,


$$\chi^2 = \sum_{i=1}^{n_d} \frac{[V(u_i, v_i) - V_m(u_i, v_i; \mathbf{p})]\,[V(u_i, v_i) - V_m(u_i, v_i; \mathbf{p})]^*}{\sigma_i^2}, \qquad (10.52)$$


where $V(u_i, v_i)$ are the measured visibilities, $V_m(u_i, v_i; \mathbf{p})$ are the model visibilities
with $n_p$ parameters $\mathbf{p}$, and the $\sigma_i$'s are the measurement errors. For a perfect fit,
the expected minimum value of $\chi^2$ is $n_d - n_p$, and the standard deviation of $\chi^2$ is
$\sqrt{2(n_d - n_p)}$. The reduced chi square, $\chi_r^2$, which is $\chi^2/(n_d - n_p)$, should be close
to unity for a good fit. $\chi_r^2 > 1$ indicates that the model is not correctly parametrized
or that the estimates of errors are not correct. In any fitting procedure, the residuals,
i.e., $[V(u_i, v_i) - V_m(u_i, v_i; \mathbf{p})]/\sigma_i$, should be examined for any systematic deviations
from a Gaussian probability distribution. Such deviations suggest that more or
different parameters are required. If the deviations follow a Gaussian probability
distribution, then the problem may be that values of $\sigma_i$ are misestimated by a
constant factor that can be chosen to make $\chi_r^2 = 1$. Another common defect is that
the data have a noise floor. In this case, the $\sigma_i^2$ terms can be replaced by $\sigma_i^2 + \sigma_f^2$,
where $\sigma_f$ represents a noise floor and $\sigma_f^2$ is chosen so that $\chi_r^2 = 1$. A $\sigma_f^2 > 0$ has the
effect of reducing the importance of measurements with low $\sigma_i$, and $\sigma_f \gg \sigma_i$ tends
toward a solution that gives equal weight for all data regardless of $\sigma_i$.
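
The goodness-of-fit quantities above are straightforward to compute. The sketch below uses names of our own and assumes complex visibilities with positive $\sigma_i$; it evaluates $\chi^2$ and the reduced $\chi^2$, and finds by bisection the noise floor that brings the reduced $\chi^2$ to unity. Bisection is our choice here; any one-dimensional root finder would serve.

```python
def chi_squared(data, model, sigma, floor=0.0):
    """Chi-square of complex data, with a noise floor added in quadrature."""
    return sum(abs(d - m) ** 2 / (s * s + floor * floor)
               for d, m, s in zip(data, model, sigma))

def reduced_chi_squared(data, model, sigma, n_params, floor=0.0):
    """Chi-square divided by (n_d - n_p), following the text's convention."""
    return chi_squared(data, model, sigma, floor) / (len(data) - n_params)

def fit_noise_floor(data, model, sigma, n_params, tol=1e-10):
    """Noise floor for which the reduced chi-square equals 1 (0 if not needed)."""
    if reduced_chi_squared(data, model, sigma, n_params) <= 1.0:
        return 0.0
    lo, hi = 0.0, max(sigma)
    # Expand the bracket until the reduced chi-square drops below 1.
    while reduced_chi_squared(data, model, sigma, n_params, hi) > 1.0:
        hi *= 2.0
    # The reduced chi-square decreases monotonically with the floor.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reduced_chi_squared(data, model, sigma, n_params, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A floor found this way should be reported alongside the fit, since it changes the relative weights of the data and therefore the fitted parameters.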

Note that Eq. (10.52) can be written as


$$\chi^2 = \sum_{i=1}^{n_d} \frac{(V_{Ri} - V_{mRi})^2 + (V_{Ii} - V_{mIi})^2}{\sigma_i^2}, \qquad (10.53)$$


where $V_{Ri}$ and $V_{Ii}$ are the real and imaginary parts of $V(u_i, v_i)$, and $V_{mRi}$ and $V_{mIi}$ are
the real and imaginary parts of $V_m$. The data to be fitted may consist of visibility
amplitudes and closure phases. In this case, the $\chi^2$ can be written as


$$\chi^2 = \sum_{i=1}^{n_d} \frac{(|V_i| - |V_{mi}|)^2}{\sigma_{a_i}^2} + \sum_{i=1}^{n_c} \frac{(\phi_{ci} - \phi_{mci})^2}{\sigma_{c_i}^2}, \qquad (10.54)$$


where $\sigma_{a_i}^2$ and $\sigma_{c_i}^2$ are the measurement variances on the visibility amplitudes and
closure phases, respectively. In the strong signal case (see Sect. 9.3.3),


$$\sigma_{a_i}^2 = \sigma_i^2 \quad \text{and} \quad \sigma_{c_i}^2 = \left(\frac{\sigma_1}{|V_1|}\right)^2 + \left(\frac{\sigma_2}{|V_2|}\right)^2 + \left(\frac{\sigma_3}{|V_3|}\right)^2, \qquad (10.55)$$

where the subscripts 1, 2, and 3 denote the three baselines of the closure triangle.
In cases of weaker signal, the application of Eq. (10.54) may not yield an optimum
solution because the probability distributions for the closure phase and amplitude become
non-Gaussian. In particular, the probability distribution of the closure amplitude
becomes progressively more skewed as the signal-to-noise ratio (SNR) decreases.

Examples of models fitted to visibility data sets with limited amounts of closure
data can be found in Akiyama et al. (2015), Fish et al. (2016), and Lu et al. (2013).

The computational challenge of finding the minimum value of $\chi^2$ can be daunting.
A popular method that is straightforward to implement but that can require large
computational resources is the Markov chain Monte Carlo (MCMC) algorithm based
on Bayesian theory. It provides a way to systematically vary the parameters in search
of a $\chi^2$ minimum. It also produces posterior probability functions for the parameters
[see, e.g., Sivia (2006)].
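
To make the idea concrete, here is a minimal Metropolis sampler, one simple member of the MCMC family, applied to the amplitude of a centered circular Gaussian model with parameters $S_0$ and $a$. The function names, the Gaussian proposal with fixed step size, the flat priors, and the fixed random seed are all illustrative assumptions; production analyses normally rely on an established sampling package and include burn-in and convergence checks.

```python
import math
import random

def model_amplitude(q, s0, a):
    """Amplitude of a centered Gaussian model at radius q in the (u, v) plane."""
    return s0 * math.exp(-2.0 * math.pi ** 2 * a ** 2 * q ** 2)

def chi2(params, q_values, data, sigma):
    s0, a = params
    return sum(((d - model_amplitude(q, s0, a)) / sigma) ** 2
               for q, d in zip(q_values, data))

def metropolis(q_values, data, sigma, start, step=0.02, n_steps=5000, seed=1):
    """Random-walk Metropolis sampling with likelihood exp(-chi2 / 2)."""
    rng = random.Random(seed)
    current = list(start)
    current_chi2 = chi2(current, q_values, data, sigma)
    best, best_chi2 = list(current), current_chi2
    chain = []
    for _ in range(n_steps):
        trial = [p + rng.gauss(0.0, step) for p in current]
        trial_chi2 = chi2(trial, q_values, data, sigma)
        # Accept with probability min(1, exp(-(chi2_trial - chi2_current) / 2)).
        if rng.random() < math.exp(min(0.0, 0.5 * (current_chi2 - trial_chi2))):
            current, current_chi2 = trial, trial_chi2
            if current_chi2 < best_chi2:
                best, best_chi2 = list(current), current_chi2
        chain.append(tuple(current))
    return best, best_chi2, chain
```

The sample with the lowest $\chi^2$ approximates the best fit, and histograms of the later part of `chain` approximate the posterior distributions of the parameters.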

There is an important relationship between the moments of the intensity distri- 
bution and the visibility. The zero-order moment is equal to the flux density S, the 
odd-order moments contribute to the imaginary components of the visibility, and 
the even-order moments contribute to the real part. If the source is symmetrical in $l$,
the odd-order terms are zero. If, in addition, the source is only slightly resolved, the
decrease in $V$ results mainly from the second-moment term. Then the source can be
represented by a symmetrical model with an appropriate second moment.

For simplicity, consider the one-dimensional problem


$$V_1(u) = \int_{-\infty}^{\infty} I_1(l)\, e^{-j2\pi u l}\, dl, \qquad (10.56)$$

where $V_1(u) = V(u, 0)$ and

$$I_1(l) = \int_{-\infty}^{\infty} I(l, m)\, dm. \qquad (10.57)$$


Each derivative of $V_1$ with respect to $u$ introduces a factor of $-j2\pi l$, so that the $n$th
derivative can be written as


$$V_1^{(n)}(u) = \int_{-\infty}^{\infty} (-j2\pi l)^n I_1(l)\, e^{-j2\pi u l}\, dl \qquad (10.58)$$

or

$$V_1^{(n)}(0) = (-j2\pi)^n \int_{-\infty}^{\infty} l^n I_1(l)\, dl. \qquad (10.59)$$

The Taylor expansion of $V_1(u)$ is


$$V_1(u) = V_1(0) + V_1'(0)\,u + V_1''(0)\,\frac{u^2}{2} + \cdots + V_1^{(n)}(0)\,\frac{u^n}{n!} + \cdots \qquad (10.60)$$

or


$$V_1(u) = M_0 + \sum_{n=1}^{\infty} \frac{(-j2\pi)^n}{n!}\, M_n u^n, \qquad (10.61)$$


where

$$M_n = \int_{-\infty}^{\infty} l^n I_1(l)\, dl. \qquad (10.62)$$


The Taylor expansion requires that the moments be finite.
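
The moment expansion can be checked on a source made of a few point components, for which both the moments and the exact visibility are simple sums. The sketch below uses our own function names and truncates the series at a finite order.

```python
import cmath
import math

def moments(fluxes, positions, n_max):
    """Moments M_n = sum_k S_k l_k^n of a set of point components, n = 0..n_max."""
    return [sum(s * l ** n for s, l in zip(fluxes, positions))
            for n in range(n_max + 1)]

def visibility_from_moments(u, m):
    """Truncated Taylor series V_1(u) = sum_n (-j 2 pi u)^n M_n / n!."""
    return sum((-2j * math.pi * u) ** n / math.factorial(n) * m_n
               for n, m_n in enumerate(m))

def visibility_exact(u, fluxes, positions):
    """Exact one-dimensional visibility of the point components."""
    return sum(s * cmath.exp(-2j * math.pi * u * l)
               for s, l in zip(fluxes, positions))
```

For a slightly resolved source ($2\pi u l \ll 1$ for all components) a handful of terms reproduces the exact visibility, and for a source symmetric in $l$ the odd moments vanish, so the series is purely real.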


10.4.2 Examples of Parameter Fitting to Models 


The model most commonly encountered in interferometry is a simple Gaussian dis- 
tribution with unknown flux density, size, and position, as described by Eq. (10.51). 
The four model parameters, $S_0$, $a$, $l_1$, and $m_1$, can be estimated from standard
procedures for nonlinear least-mean-squares analysis (Appendix 12.1). This anal- 
ysis requires initial guesses for the parameters. The model can be generalized to 
an elliptical Gaussian source described by major and minor axis diameters and a 
position angle (six-parameter fit). 

To gain an understanding of the accuracy to which parameters of a simple 
model can be deduced, consider a slightly resolved source having an azimuthally 
symmetric distribution of unknown position observed at a set of ng points with 
noise o. In the case of high SNR, we can analyze the visibility amplitude and phase 
separately. The model for the visibility phase and amplitude can be written 


@ = 2x (uih + vim) (10.63) 
IV| = So -be , (10.64) 


where q? = u? + v? and l, mı, and b are parameters to be determined. We further 
assume that m; is zero. 

A simulated data set is shown in Fig. 10.10. The models are linear in the parameters
$l_1$, $S_0$, and $b$. These parameters can be estimated via the usual linear solutions
to the $\chi^2$ minimization equations for phase and amplitude [see Appendix 12.1 or
Bevington and Robinson (1992)]. The estimate of $l_1$ is

$$l_1 = \frac{1}{2\pi}\,
\frac{\sum_{i=1}^{n_d} u_i \phi_i / \sigma_{\phi i}^2}
     {\sum_{i=1}^{n_d} u_i^2 / \sigma_{\phi i}^2} , \tag{10.65}$$

where $\sigma_{\phi i} \simeq \sigma_i/|V_i|$ and $\sigma_i$ is defined in Eq. (6.50). We assume that all antennas
have the same sensitivity, so $\sigma_i = \sigma$, and that $|V| \simeq S_0$, so that $\sigma_{\phi i}$ is approximately
constant. In this case,

$$\sigma_{l_1} = \frac{\sigma/S_0}{2\pi \sqrt{\sum_{i=1}^{n_d} u_i^2}} . \tag{10.66}$$

If the data are uniformly spaced at intervals $\Delta u$, i.e., $u_i = i\,\Delta u$, then
$\sum u_i^2 = (\Delta u)^2 \sum i^2 = (\Delta u)^2 n_d (n_d + 1)(2 n_d + 1)/6 \simeq (\Delta u)^2 n_d^3/3$
for $n_d \gg 1$. Hence,

$$\sigma_{l_1} \simeq \frac{1}{2\pi} \sqrt{\frac{3}{n_d}}\, \frac{\sigma}{S_0}\, \frac{1}{u_{\max}} , \tag{10.67}$$

where $u_{\max} = n_d\,\Delta u$, or

$$\sigma_{l_1} \simeq \frac{0.3}{S_0 \sqrt{n_d}/\sigma}\, \frac{\lambda}{D_{\max}} , \tag{10.68}$$

where $D_{\max} = \lambda u_{\max}$. This formula is close to the one used in astrometry for direct
image fitting [see Eq. (12.16)].

Fig. 10.10 Fringe visibility model and data for a slightly resolved, azimuthally symmetric source.
(left) Visibility amplitude quadratically declining, indicative of the source being resolved; (right)
visibility phase, indicative of a position offset.
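The scaling in Eqs. (10.66) and (10.67) can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis: it assumes $m_1 = 0$, equal phase noise $\sigma_\phi = \sigma/S_0$ (in radians) on every point, offsets small enough that the phases do not wrap, and purely illustrative values for $n_d$, $\Delta u$, and $l_1$. The function names are hypothetical.

```python
import math
import random


def fit_offset(u, phi):
    """Equal-weight least-squares estimate of l_1 for phi = 2*pi*u*l_1 (cf. Eq. 10.65)."""
    num = sum(ui * ph for ui, ph in zip(u, phi))
    den = sum(ui * ui for ui in u)
    return num / (2.0 * math.pi * den)


def predicted_sigma_l1(n_d, sigma_over_s0, u_max):
    """Approximate error of l_1 for uniformly spaced data (cf. Eq. 10.67)."""
    return math.sqrt(3.0 / n_d) * sigma_over_s0 / (2.0 * math.pi * u_max)


def simulate(n_d=200, delta_u=500.0, l1=1.0e-7, sigma_over_s0=0.05,
             trials=2000, seed=1):
    """Return (mean fitted l_1, scatter of fitted l_1, predicted scatter)."""
    rng = random.Random(seed)
    u = [(i + 1) * delta_u for i in range(n_d)]  # u_i = i * delta_u, in wavelengths
    estimates = []
    for _ in range(trials):
        # Noisy phases in radians; the phase noise is sigma / S_0 at high SNR.
        phi = [2.0 * math.pi * ui * l1 + rng.gauss(0.0, sigma_over_s0) for ui in u]
        estimates.append(fit_offset(u, phi))
    mean = sum(estimates) / trials
    scatter = math.sqrt(sum((e - mean) ** 2 for e in estimates) / (trials - 1))
    return mean, scatter, predicted_sigma_l1(n_d, sigma_over_s0, n_d * delta_u)
```

Calling `simulate()` returns the mean and scatter of the fitted offsets together with the value predicted by Eq. (10.67); for large $n_d$ the two scatter values should be close, with the residual difference set by the finite number of trials.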
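The estimates of $S_0$ and $b$, along with their errors, $\sigma_{S_0}$ and $\sigma_b$, are

$$S_0 = \frac{1}{\Delta} \left[ \sum_{i=1}^{n_d} q_i^4 \sum_{i=1}^{n_d} |V_i|
      - \sum_{i=1}^{n_d} q_i^2 \sum_{i=1}^{n_d} |V_i|\, q_i^2 \right] \tag{10.69}$$

$$\sigma_{S_0}^2 = \frac{\sigma^2}{\Delta} \sum_{i=1}^{n_d} q_i^4 \tag{10.70}$$

$$b = \frac{1}{\Delta} \left[ \sum_{i=1}^{n_d} q_i^2 \sum_{i=1}^{n_d} |V_i|
      - n_d \sum_{i=1}^{n_d} |V_i|\, q_i^2 \right] \tag{10.71}$$

$$\sigma_b^2 = \frac{n_d\, \sigma^2}{\Delta} , \tag{10.72}$$

where $\Delta = n_d \sum q_i^4 - \left( \sum q_i^2 \right)^2$. If the data are uniformly spaced at intervals of
$\Delta q$ from 0 to $q_{\max} = n_d\,\Delta q$, then, if we use the approximations $\sum q_i^4 \simeq (\Delta q)^4 n_d^5/5$ and
$\sum q_i^2 \simeq (\Delta q)^2 n_d^3/3$, for $n_d \gg 1$,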


(10.73) 


$$\sigma_b^2 \simeq \frac{5 \sigma^2}{n_d}\, \frac{1}{q_{\max}^4} . \tag{10.74}$$


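For a Gaussian source distribution, the Taylor expansion of the visibility function
in Eq. (10.51) (see Table 10.2) gives $b = 2\pi^2 a^2 S_0$. Since $\theta_G$, the FWHM angular
diameter, is $\sqrt{8 \ln 2}\, a$, we obtain

$$\theta_G = \left[ \frac{4 \ln 2}{\pi^2}\, \frac{b}{S_0} \right]^{1/2} . \tag{10.75}$$

The uncertainty in $\theta_G$, $\sigma_{\theta_G}$, for the case $\sigma_{\theta_G} \ll \theta_G$, will be

$$\sigma_{\theta_G} \simeq \frac{4 \ln 2}{2\pi^2} \sqrt{\frac{5}{n_d}}\, \frac{\sigma}{S_0}\, \frac{1}{\theta_G\, q_{\max}^2} . \tag{10.76}$$

The minimum source size that can actually be measured at the 1-sigma error level
is about $\theta_{\min} \simeq \sigma_{\theta_G} \simeq \theta_G$, which is

$$\theta_{\min} \simeq \frac{0.6}{\sqrt{R_{sn}}}\, \frac{\lambda}{D_{\max}} , \tag{10.77}$$

where the signal-to-noise ratio $R_{sn} = S_0 \sqrt{n_d}/\sigma$ and $D_{\max} = \lambda q_{\max}$. A more precise
and general analysis for various levels of statistical significance is given by
Martí-Vidal et al. (2012).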
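To give a feel for the numbers, the following short sketch evaluates Eq. (10.77) for a given flux density, noise level, number of data points, wavelength, and maximum baseline. The function name, argument names, and the example values in the usage note are illustrative assumptions only.

```python
import math

RAD_TO_MAS = 180.0 / math.pi * 3600.0 * 1000.0  # radians to milliarcseconds


def min_measurable_size(s0, sigma, n_d, wavelength, d_max):
    """Minimum measurable Gaussian FWHM in radians, following the form of Eq. (10.77).

    s0 and sigma share the same units (e.g., Jy); wavelength and d_max share the
    same units (e.g., m).
    """
    r_sn = s0 * math.sqrt(n_d) / sigma  # signal-to-noise ratio R_sn
    return 0.6 / math.sqrt(r_sn) * wavelength / d_max


def nominal_resolution(wavelength, d_max):
    """Nominal fringe spacing lambda / D_max in radians."""
    return wavelength / d_max
```

For example, `min_measurable_size(1.0, 0.01, 100, 0.01, 1.0e6) * RAD_TO_MAS` gives the limit in milliarcseconds for a 1-Jy source observed with 100 points of 10-mJy noise at 1-cm wavelength on 1000-km baselines; because $R_{sn} = 1000$ in this case, the limit is a factor $0.6/\sqrt{1000}$ of the nominal $\lambda/D_{\max}$.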

Note that position and angular parameters can be estimated to an accuracy limited
only by the SNR and the confidence in the model. When the SNR is very high, the
size can be determined even though it is much less than the nominal beam size.
Model fitting should not be confused with super-resolution deconvolution.²

[Table 10.2 Visibility functions for azimuthally symmetric source distributions.
Notes: Taylor expansion, $V(q) = V(0)[1 - bq^2]$. For additional models and fitting
algorithms, see Lobanov (2015), Ng et al. (2008), and Martí-Vidal et al. (2014).]


10.4.3 Modeling Azimuthally Symmetric Sources 


A very important class of models is those that have azimuthal symmetry, i.e., 
$I(l,m) = I(r)$, where $r = \sqrt{l^2 + m^2}$. For the following analysis, the position of
the source is assumed to be known. In this case, the Fourier transform between
the image and visibility becomes a Hankel transform [see Bracewell (1995, 2000),
Baddour (2009)], i.e.,

$$V(q) = 2\pi \int_0^\infty I(r)\, J_0(2\pi r q)\, r\, dr \tag{10.78}$$

$$I(r) = 2\pi \int_0^\infty V(q)\, J_0(2\pi r q)\, q\, dq , \tag{10.79}$$

where $q = \sqrt{u^2 + v^2}$. $V(q)$ is a real function, i.e., the visibility phase is zero.
A useful model is one of a uniformly bright circular source of intensity $I_0$ and
radius $a$. Since $\int J_0(x)\, x\, dx = x J_1(x)$,

$$V(q) = \pi a^2 I_0\, \frac{J_1(2\pi a q)}{\pi a q} , \tag{10.80}$$

where $J_1(2\pi a q)/\pi a q = 1$ for $q = 0$ and $\pi a^2 I_0 = S_0$, the total flux density. The
visibility of an annulus of inner and outer radii $a_1$ and $a_2$ can be represented as the
difference of two disk visibility functions

$$V(q) = \pi a_2^2 I_0\, \frac{J_1(2\pi a_2 q)}{\pi a_2 q}
      - \pi a_1^2 I_0\, \frac{J_1(2\pi a_1 q)}{\pi a_1 q} , \tag{10.81}$$

i.e., the difference of two area-normalized jinc functions. The visibility functions for 
these and a number of other models are listed in Table 10.2 and shown in Fig. 10.11. 
An important lesson is that circularly symmetric models are very hard to distinguish 
for short baselines where the visibility decreases quadratically according to a size 
parameter. It is interesting to compare the visibility functions for a ring and thin 
annular disk, as shown in Fig. 10.12. The visibilities become significantly different 
only when q reaches about 1/ring thickness. 
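The disk and annulus visibility functions discussed above are straightforward to evaluate. The sketch below is one possible implementation using only the Python standard library; since the standard library has no Bessel functions, $J_1$ is computed from Bessel's integral with the trapezoid rule, which is an implementation choice made here and not something prescribed by the text. Radii are assumed to be in radians and $q$ in wavelengths; all function names are hypothetical.

```python
import math


def bessel_j1(x, steps=400):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (trapezoid rule)."""
    h = math.pi / steps

    def f(t):
        return math.cos(t - x * math.sin(t))

    total = 0.5 * (f(0.0) + f(math.pi)) + sum(f(k * h) for k in range(1, steps))
    return total * h / math.pi


def disk_visibility(q, a, i0=1.0):
    """Visibility of a uniform disk of radius a and intensity i0 (form of Eq. 10.80)."""
    s0 = math.pi * a * a * i0          # total flux density
    x = math.pi * a * q
    if abs(x) < 1.0e-4:
        # Series J_1(2x)/x = 1 - x^2/2 + ... avoids 0/0 at the origin.
        return s0 * (1.0 - 0.5 * x * x)
    return s0 * bessel_j1(2.0 * x) / x


def annulus_visibility(q, a_inner, a_outer, i0=1.0):
    """Visibility of a uniform annulus as the difference of two disks (form of Eq. 10.81)."""
    return disk_visibility(q, a_outer, i0) - disk_visibility(q, a_inner, i0)
```

The first null of the disk visibility falls where $2\pi a q$ equals the first zero of $J_1$ (about 3.8317), which provides a quick sanity check on the implementation.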

²In some fields, such model fitting is called “breaking the diffraction barrier” [e.g., Betzig et al.
(1991)].

Fig. 10.11 Normalized visibility models, $|V|/V_0$, vs. projected baseline length, $q$, for
azimuthally symmetric source models described in Table 10.2.

Fig. 10.12 (thin line) Visibility amplitude for a ring source with radius 1. (thick line) Visibility
amplitude for an annular source with inner and outer radii of 0.8 and 1.2, respectively. Adapted
from Bracewell (2000).

A useful model for the analysis of an azimuthally symmetric source might be a
superposition of annuli in image space with intensities $I_i$ and outer and inner radii
of $a_i$ and $a_{i-1}$. The inner radius of the innermost annulus is zero, so it is a disk. The
visibility function is

$$V(q) = \pi I_0 a_0^2\, \frac{J_1(2\pi a_0 q)}{\pi a_0 q}
 + \pi I_1 \left[ a_1^2\, \frac{J_1(2\pi a_1 q)}{\pi a_1 q} - a_0^2\, \frac{J_1(2\pi a_0 q)}{\pi a_0 q} \right]
 + \cdots
 + \pi I_n \left[ a_n^2\, \frac{J_1(2\pi a_n q)}{\pi a_n q} - a_{n-1}^2\, \frac{J_1(2\pi a_{n-1} q)}{\pi a_{n-1} q} \right] . \tag{10.82}$$

For the case of a uniform disk, all the $I_i$ are the same, i.e., $I_0$, and the visibility is
that of a uniform disk of radius $a_n$ and intensity $I_0$,

$$V(q) = \pi a_n^2 I_0\, \frac{J_1(2\pi a_n q)}{\pi a_n q} , \tag{10.83}$$

as expected. Equation (10.82) can be rearranged as

$$V(q) = \pi \sum_{i=0}^{n-1} (I_i - I_{i+1})\, a_i^2\, \frac{J_1(2\pi a_i q)}{\pi a_i q}
 + \pi I_n a_n^2\, \frac{J_1(2\pi a_n q)}{\pi a_n q} . \tag{10.84}$$

Equation (10.84) can be fitted to data from sources with elliptical symmetry by a
simple change in coordinates.
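As an illustration of the annuli model, the sketch below evaluates the visibility both as a sum over annuli [the form of Eq. (10.82)] and in the rearranged form [Eq. (10.84)], so that the two can be compared numerically. It is a standard-library-only sketch: the trapezoid-rule evaluation of $J_1$, the list-based interface (`radii[i]` $= a_i$, `intensities[i]` $= I_i$), and the function names are assumptions made for this example.

```python
import math


def bessel_j1(x, steps=400):
    """J_1(x) from Bessel's integral, evaluated with the trapezoid rule."""
    h = math.pi / steps

    def f(t):
        return math.cos(t - x * math.sin(t))

    total = 0.5 * (f(0.0) + f(math.pi)) + sum(f(k * h) for k in range(1, steps))
    return total * h / math.pi


def disk_term(q, a):
    """pi a^2 J_1(2 pi a q) / (pi a q): visibility of a unit-intensity disk of radius a."""
    x = math.pi * a * q
    if abs(x) < 1.0e-4:
        return math.pi * a * a * (1.0 - 0.5 * x * x)
    return math.pi * a * a * bessel_j1(2.0 * x) / x


def annuli_visibility(q, radii, intensities):
    """Sum over a central disk plus nested annuli (form of Eq. 10.82)."""
    v = intensities[0] * disk_term(q, radii[0])
    for i in range(1, len(radii)):
        v += intensities[i] * (disk_term(q, radii[i]) - disk_term(q, radii[i - 1]))
    return v


def annuli_visibility_rearranged(q, radii, intensities):
    """Same model grouped by radius (form of Eq. 10.84)."""
    n = len(radii) - 1
    v = intensities[n] * disk_term(q, radii[n])
    for i in range(n):
        v += (intensities[i] - intensities[i + 1]) * disk_term(q, radii[i])
    return v
```

The rearranged form needs only one Bessel-function evaluation per radius, which is convenient when the intensities $I_i$ are the free parameters of a linear fit.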


10.4.4 Modeling of Very Extended Sources 


The technique of visibility modeling can be of particular importance for diffuse 
symmetric sources. The models for these sources often do not have finite moments, 
although they can have well-defined visibility functions. However, the Taylor
expansion of the visibility function around q = 0 described in the previous section
cannot be used. We discuss two important practical examples.

Fig. 10.13 (left) The intensity distribution defined by Eq. (10.85) for a stellar wind source where
the radius is in units of $a$. The inset shows the intensity on a logarithmic scale. (right) Visibility
function for the intensity distribution. The inset shows the visibility function near q = 0 and also
for the case in which the intensity distribution is truncated at r = 5a. Note that the visibility
function departs from the untruncated distribution for q < 1/truncation radius and approaches
q = 0 quadratically.

The first example is that of a radio source created by a fully ionized wind, i.e.,
thermal plasma at constant temperature $T_e$, surrounding a star. If the wind has a
constant velocity of expansion, the electron density will decrease as the inverse
square of the distance from the star. It can be shown (Wright and Barlow 1975)
that the intensity distribution for such a source can be written as

$$I(r) = I_0 \left[ 1 - e^{-(r/a)^{-3}} \right] \simeq
\begin{cases} I_0 , & r \ll a , \\ I_0\, (r/a)^{-3} , & r \gg a , \end{cases} \tag{10.85}$$

where $I_0 = 2kT_e(\nu/c)^2$ (the Planck function in the Rayleigh-Jeans limit), and $a$ is the
angular radius where the optical depth is unity. The rather benign-looking intensity profile,
shown in Fig. 10.13, has an FWHM of about 1.25a, and the intensity falls off as $r^{-3}$. The
flux density is given by


27a lo 


=) 10.86 
° ATOA) aa 


where $\Gamma$ is the gamma function. $S_0$ is 1.3 times the flux density of a uniformly
bright source of radius $a$. This source has the interesting characteristic that its
angular size varies as $\nu^{-0.7}$ (because the free-free absorption coefficient scales as
$\nu^{-2.1}$), and the flux density varies as $\nu^{0.6}$ (see the example of MWC349A in Fig. 1.1).
However, the second and higher
moments of the intensity distribution are infinite. Nonetheless, the visibility function
can be calculated from Eq. (10.78). It is shown in Fig. 10.13. It has the interesting
characteristic that it decreases linearly (rather than quadratically) with $q$, that is,

$$V(q) \simeq S_0 (1 - bq) , \tag{10.87}$$


where b = 27 /So. This behavior can be understood intuitively from the fact that the 
source extends smoothly to infinity. Hence, the correlated flux density continues to
increase as the baseline decreases to zero. Such a visibility curve has been observed
[e.g., White and Becker (1982) and Contreras et al. (2000)] down to the shortest 
baselines used for the measurements. From the zero spacing flux, V(0) = So, 
and the slope of the normalized visibility curve, b, we can determine the electron 
density at a reference distance and the electron temperature (Escalante et al. 1989). 
A more realistic model is one with an ionization cutoff at some distance from the 
star, which truncates the radio emission. Making the source finite in extent makes 
all the moments finite, and the visibility function, shown in Fig. 10.13 (right) is 
dominated by a quadratic term at zero baseline. In this case, the outer radius of the 
source can be found from the visibility curvature at q = 0 as well as the density 
parameter and electron temperature. 

The second example is useful in modeling the Sunyaev–Zeldovich effect. An
isothermal spherical distribution of ionized gas in a cluster of galaxies causes a
decrement in the cosmic microwave background. For many clusters, the profile of
this decrement can be modeled as

$$I(r) = \frac{I_0}{\sqrt{1 + (r/a)^2}} , \tag{10.88}$$

where $I_0$ is the decrement at the cluster center, and $a$ is the cluster core angular
radius. The visibility function for this distribution has the analytic form (Bracewell
2000)

$$V(q) = 2\pi a^2 I_0\, \frac{e^{-2\pi a q}}{2\pi a q} . \tag{10.89}$$

The visibility increases very rapidly as $q$ decreases, and synthesis images made
with missing short spacings are likely to underestimate $I_0$. However, the parameters
$I_0$ and $a$ can be readily estimated by fitting Eq. (10.89) to the visibility data (Hasler
et al. 2012; Carlstrom et al. 1996). As in the case of the stellar wind source,
an actual cluster source will be truncated at some radius, $r_c$, which will keep the flux
density finite and will give the visibility function a parabolic shape for baselines less
than $1/r_c$.
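The steep rise of the cluster visibility toward short spacings can be explored with a few lines of code. The sketch below assumes the profile $I(r) = I_0[1 + (r/a)^2]^{-1/2}$, whose Hankel transform is $V(q) = a I_0\, e^{-2\pi a q}/q$; this is an assumption about the intended form of Eqs. (10.88) and (10.89). The two-point estimator is a deliberately simple, noise-free illustration and not a substitute for a proper least-squares fit. All names are hypothetical.

```python
import math


def sz_visibility(q, i0, a):
    """V(q) = 2 pi a^2 I_0 exp(-2 pi a q) / (2 pi a q) for q > 0 (a in radians, q in wavelengths)."""
    x = 2.0 * math.pi * a * q
    return 2.0 * math.pi * a * a * i0 * math.exp(-x) / x


def estimate_from_two_points(q1, v1, q2, v2):
    """Solve for (I_0, a) from two noise-free visibilities at distinct baselines q1 != q2.

    Uses q V(q) = a I_0 exp(-2 pi a q), so the ratio of q V at two baselines
    depends on a alone.
    """
    ratio = (v2 * q2) / (v1 * q1)
    a = math.log(ratio) / (2.0 * math.pi * (q1 - q2))
    i0 = v1 * q1 * math.exp(2.0 * math.pi * a * q1) / a
    return i0, a
```

Because $V(q)$ grows roughly as $1/q$ at short baselines, the shortest sampled spacing largely determines how much of the decrement is recovered, which is why fitting the visibilities directly is preferable to measuring $I_0$ from an image.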


10.5 Spectral Line Observations 


A basic requirement for observation of spectral lines is a receiving system that provides
measurements of the signal intensity in a bandwidth less than, or comparable
to, that of the expected spectral feature. Thus, a spectral line correlator produces
separate visibility measurements at many points across the receiver passband, and
the intensity distribution of the line features can be obtained. The data reduction
involved is in principle the same as used in continuum imaging but differs in some
practical details. The number of channels into which the received signal is divided
is typically in the range 100–10,000. The discussion in this section is largely based
on Ekers and van Gorkom (1984) and van Gorkom and Ekers (1989).

Calibration of the instrumental bandpass response is perhaps the most important 
step in obtaining accurate spectral line data. Generally, the channel-to-channel 
differences are relatively stable with time and need not be calibrated as frequently 
as the time-variable effects of the overall receiver gain. Except in very early 
systems, the channel filtering (see Sect. 8.8) is performed digitally and is not 
susceptible to ambient variations in temperature or voltage. The overall gain 
variations require periodic observation of a calibration source, as described for 
continuum observations. For this purpose, the summed response of the individual 
channels is often used, since a much longer observing time would be required to 
obtain a sufficient SNR in each narrow channel. For the bandpass calibration, a 
longer observation of a calibrator can be made to determine the relative gains of 
the spectral channels. Since the relative gains of the different channels into which 
the passband is divided change very little with time, the bandpass calibration need 
only be performed once or twice during, say, an 8-h observation. The bandpass 
calibration source should be unresolved and strong enough to provide good SNR in 
the spectral channels and should have a sufficiently flat spectrum. However, it need 
not be close in position to the source being observed. 

Bandpass ripples resulting from standing waves between the antenna feed and 
the reflector, which pose a serious problem for single-antenna total-power systems, 
are much less important for interferometers. This is because the instrumental noise, 
including thermal noise picked up in the antenna sidelobes, is not correlated between 
antennas. On the other hand, for digital correlators, the Gibbs phenomenon ripples in 
the passband, which arise in Fourier transformation from the delay to the frequency 
domains, introduce a problem not found in autocorrelators. Because the cross- 
correlation of the signals from two antennas is real but not symmetrical as a function 
of delay, the cross power spectrum as a function of frequency is complex. (The 
autocorrelation function of the signal from a single antenna is real and symmetrical, 
and the power spectrum is real.) As explained in Sect. 8.8.8 (see Fig. 8.18), the 
imaginary part of the cross power spectrum changes sign at the origin, but the real 
part does not. Because of this large discontinuity at the frequency origin, ripples in 
the imaginary part of the frequency spectrum are of larger relative amplitude than 
those in the real part. The peak overshoot in the imaginary part is 18% (9% of the 
full step size); see also Bos (1984, 1985). Figure 10.14 shows a calculated example. 
The ratio of the real and imaginary parts depends on the instrumental phase (which 
is not calibrated out at this stage of the analysis) and on the position of the source 
of the radiation relative to the phase center of the field. 

Fig. 10.14 (a) The cross power spectrum resulting from a continuum source in which the phase is
arbitrarily chosen such that the amplitudes of the real and imaginary parts are equal. (b) Computed
response of a cross-correlator with 16 channels to the spectrum in (a). Note the difference in
amplitude of the ripples in the real and imaginary parts. From D’Addario (1989), courtesy of and
© the Astronomical Society of the Pacific.

Increasing the number of lags of a lag correlator, or the size of the FFT in an
FX correlator, improves the spectral resolution and confines the Gibbs phenomenon
ripples more closely to the bandpass edges. The data from the channels at the
band edges are sometimes discarded because of the ripples and the roll-off of the
frequency response. However, variations in the passband are less important in later
systems in which the signals are in digital form and the passband is defined by
digital filtering. An effective way to reduce the amplitude of the ripples is to taper
the cross-correlation function and thus introduce smoothing into the cross power
spectrum. For this smoothing, the Hann function (see Table 8.5) is often used. van
Gorkom and Ekers (1989) draw attention to the following examples:


1. If the field contains a line source but no continuum, and the line is confined to 
the central part of the passband, then the spectrum has no discontinuity at the 
passband edges. This is the only case in which it is advisable to use different 
tapering of the cross-correlation function for the source and the continuum 
calibrator. 

2. If, in addition to the line source, the field contains one continuum point source, 
and if both this source and the bandpass calibrator are at the centers of their 
respective fields, then an accurate calibration of the bandpass ripples is possible. 
The same weighting must be used for the source and calibrator. 

3. In more complicated cases—for example, when there is both a line source and an 
extended continuum source within the field—the ripples will be different in the 
two cases, and exact calibration is not possible. Hann smoothing of the spectra 
of both the source and the calibrator is recommended. 
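Hann smoothing, as recommended in the cases above, can be applied directly to the channel data. Multiplying the cross-correlation function by a Hann taper is equivalent to a three-point running mean of the spectrum with weights 1/4, 1/2, 1/4. The following is a minimal sketch of that operation on a list of (real or complex) channel values; leaving the two end channels unchanged is a simplification chosen for this example, and the function name is hypothetical.

```python
def hann_smooth(spectrum):
    """Three-point Hann smoothing (weights 1/4, 1/2, 1/4) of a channel spectrum.

    Works for real or complex channel values. The first and last channels are
    copied unchanged because they lack one neighbor.
    """
    n = len(spectrum)
    out = list(spectrum)
    for k in range(1, n - 1):
        out[k] = 0.25 * spectrum[k - 1] + 0.5 * spectrum[k] + 0.25 * spectrum[k + 1]
    return out
```

A ripple that alternates in sign from channel to channel, which is the fastest oscillation the sampled spectrum can contain, is removed completely in the interior channels, while a flat continuum is left unchanged. The cost is a loss of spectral resolution, so the same smoothing should be applied to both source and calibrator.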


10.5.1 VLBI Observations of Spectral Lines 


Since VLBI observations are limited to sources of very high brightness temperature, 
spectral line measurements in VLBI are used mainly for the study of masers 
and absorption of emission from bright extragalactic sources by molecular clouds.
Frequently observed maser lines include those arising from OH, H$_2$O, CH$_3$OH, and
SiO. For absorption studies, many atomic and molecular species can be observed
since the brightness temperature requirement is fulfilled by the background source. 
The formalism of spectral line signal processing is described in Sect. 9.3. Special 
considerations for astrometric measurements are given in Sect. 12.7. Here we 
discuss several practical issues related to the handling of spectroscopic data. The use 
of independent frequency standards at the antennas results in time-dependent timing 
errors, which introduce linear phase slopes across the basebands. The difference in 
Doppler shifts among the antennas can be large, and hence the residual fringe rates 
can also be large, which may necessitate short integration times for calibration. For 
masers, the phase calibration can usually be obtained from the use of the phase of a 
particular spectral feature as a reference. The amplitude calibration can be obtained 
from the measurement of the spectra derived from the data recorded at individual 
antennas. More details of procedures for handling spectral line data can be found in 
Reid (1995, 1999). 

In spectral line VLBI, it is usual to observe a compact continuum calibrator
several times an hour, preferably one strong enough to give an accurate fringe
measurement in 1 or 2 min of integration. If a lag-type correlator is used to cross-correlate
the signals, the output is a function of time and delay. Equation (9.21), in
which $\Delta\tau_g$ and $\theta_{21}$ are functions of time, shows cross-correlation as a function of
time and delay. By Fourier transformation, the arguments $t$ and $\tau$ can be changed
to the corresponding conjugate variables, which are fringe frequency, $\nu_f$, and the
frequency of the spectral feature, $\nu$, respectively. Thus, the correlator output can be
expressed as a function of $(t, \tau)$, $(\nu_f, \tau)$, $(t, \nu)$, or $(\nu_f, \nu)$ and can be interchanged
between these domains by Fourier transformation. This is important because some
steps in the calibration are best performed in particular domains. Note that the fringe
frequency in VLBI observations results mainly from the difference between the true
fringe frequency and the model fringe frequency used to stop the fringes. Consider
first the data from the continuum calibrator. In fringe fitting for a continuum source,
it is advantageous to use visibility data as a function of fringe frequency and delay,
$(\nu_f, \tau)$, as shown in Fig. 9.7. In that domain, the visibility data are most compactly
concentrated and therefore most easily identified in the presence of the noise. In
the absence of errors, the visibility will be concentrated at the origin in the $(\nu_f, \tau)$
domain. A shift from the origin in the $\tau$ coordinate indicates timing errors resulting
from clock offsets or baseline errors. The shift $\Delta\tau$ represents the difference in the
errors for the two antennas. Values of $\Delta\tau$ determined from the continuum calibrator
are used to apply corrections to the spectral line data. Variation of the $\Delta\tau$ values over
time requires interpolation to the times of the spectral line data. The continuum data
can also be used for bandpass calibration, to determine the relative amplitude and
phase characteristics of the spectral channels.

For fringe fitting to spectral line data, it is advantageous to transform to the
$(t, \nu)$ domain since, in contrast with the continuum case, the spectral line data
contain features that are narrow in frequency. The cross-correlation function is
therefore correspondingly broad in the delay dimension and generally more compact
in frequency. Note that in the $\tau$-to-$\nu$ transformation, $\nu$ is not the frequency of the
radiation as received at the antenna, since the frequency of a local oscillator (or a
combination of more than one local oscillator), $\nu_{LO}$, has been subtracted. Thus,
$\nu$ here represents the frequency within the intermediate-frequency (IF) band that
is sampled and recorded for transmission to the correlator. The $(t, \nu)$ domain is
also appropriate for inserting corrections for the timing errors, $\Delta\tau$, determined from
the continuum data. These corrections are made by inserting phase offsets that are
proportional to frequency. Thus, the data as a function of $(t, \nu)$ are multiplied by*
$\exp(j2\pi\,\Delta\tau\,\nu)$. If the variation in the $\Delta\tau$ values over time results from a clock rate
error at one or both of the antennas, correction should be made for the associated
error in the frequency $\nu_{LO}$ at the antennas. The resulting phase error is corrected by
multiplying the correlator output data by $\exp(j2\pi\,\Delta\tau\,\nu_{LO})$.
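The delay correction described above amounts to removing a linear phase slope across the band. A minimal sketch follows, assuming per-channel complex visibilities for one baseline and time, channel frequencies measured within the IF band, and a residual delay already determined from the continuum calibrator. The function name and the `sign` argument are assumptions introduced here; the appropriate sign depends on the conventions in use.

```python
import cmath
import math


def apply_delay_correction(vis, freqs_hz, delta_tau_s, sign=-1):
    """Multiply each channel by exp(sign * j 2 pi delta_tau nu).

    vis        : complex visibilities, one per spectral channel
    freqs_hz   : channel frequencies within the IF band, in Hz
    delta_tau_s: residual delay from the continuum calibrator, in seconds
    sign       : +1 or -1, chosen to match the sign convention of the data
    """
    return [v * cmath.exp(sign * 2j * math.pi * delta_tau_s * f)
            for v, f in zip(vis, freqs_hz)]
```

For instance, a 50-ns delay error produces a phase slope of 18 degrees per MHz, so across a 16-MHz band the phase winds through most of a turn until the correction is applied.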

Since Doppler shift corrections (see Appendix 10.2) are rarely made as local 
oscillator offsets at the antennas, these corrections must be made at the correlator or 
subsequently in the post-processing analysis. The diurnal Doppler shift is normally 
removed at the station level in the precorrelation fringe rotation, where the signals 
are delayed and frequency-shifted to a reference point at the center of the Earth. 
Correction for the Doppler shift due to the Earth’s orbital motion and the local 
standard of rest, as well as any other frequency offset, can conveniently be made 
on the post-correlation data by use of the shift theorem, that is, multiplication of 
the correlation functions by $\exp(j2\pi\,\Delta\nu\,\tau)$, where $\Delta\nu$ is the total frequency shift
desired.
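The shift theorem used for the post-correlation Doppler correction can be illustrated with a small discrete example. In the sketch below, a correlation function sampled at $N$ lags is multiplied by a phase ramp, which shifts its spectrum by a chosen number of channels (circularly, since the transform is discrete). The plain DFT routines, the sign convention, and the function names are assumptions made for this illustration.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform from lag index m to channel index p."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * p * m / n) for m in range(n))
            for p in range(n)]


def idft(spec):
    """Inverse transform from channel index p to lag index m."""
    n = len(spec)
    return [sum(spec[p] * cmath.exp(2j * math.pi * p * m / n) for p in range(n)) / n
            for m in range(n)]


def shift_spectrum(corr, shift_channels):
    """Multiply the correlation function by exp(j 2 pi k m / N) to shift its spectrum.

    shift_channels (k) is the desired frequency shift in units of the channel
    spacing and need not be an integer.
    """
    n = len(corr)
    return [c * cmath.exp(2j * math.pi * shift_channels * m / n)
            for m, c in enumerate(corr)]
```

With these conventions, a line in channel 5 of a 16-channel spectrum moves to channel 8 after `shift_spectrum(corr, 3)`. A noninteger shift interpolates the line between channels.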

The visibility spectra can be calibrated in units of flux density by multiplication 
of the normalized visibility spectra by the geometric mean of the system equivalent 
flux densities (SEFDs) of the two antennas concerned, as discussed in Sect. 10.1.2. 
The SEFD is defined in Eq. (1.7). It can be determined from occasional supplemental
measurements at the antennas, and the results interpolated in time. A better
method for strong sources is to calculate the total-power spectrum of the source from 
the autocorrelation functions of the data from each antenna. These must be corrected 
for the bandpass response, which can be obtained from the autocorrelation functions 
on a continuum fringe calibrator. The amplitude of a specific spectral feature is 
proportional to the reciprocal of the SEFD. If greater sensitivity is required, then 
each measured spectrum can be matched to a spectral template obtained from a 
global average of all the single-antenna data or from a spectrum taken with the most 
sensitive antenna in the array. The difficulty with this method is that it is seldom 
convenient to acquire bandpass spectra often enough to ensure sufficiently accurate 
baseline subtraction on weak sources. 
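The flux density calibration described at the start of this discussion reduces to a single scaling per baseline. A minimal sketch, assuming normalized (dimensionless) visibilities and SEFD values in janskys for the two antennas of the baseline; the function name and the example values in the usage note are illustrative only.

```python
import math


def calibrate_visibility(normalized_vis, sefd_1_jy, sefd_2_jy):
    """Scale a normalized visibility (real or complex) to janskys.

    The scale factor is the geometric mean of the system equivalent flux
    densities of the two antennas forming the baseline.
    """
    return normalized_vis * math.sqrt(sefd_1_jy * sefd_2_jy)


def calibrate_spectrum(normalized_spectrum, sefd_1_jy, sefd_2_jy):
    """Apply the same scaling to every channel of a visibility spectrum."""
    return [calibrate_visibility(v, sefd_1_jy, sefd_2_jy) for v in normalized_spectrum]
```

For example, a normalized correlation of 0.001 on a baseline between antennas with SEFDs of 300 and 1200 Jy corresponds to 0.6 Jy.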

If the total frequency bandwidth in the measurements is covered by using two or
more IF bands of the receiving system, it is necessary to correct for differences in
their instrumental phase responses. This can be done using the continuum calibrator
measurements, by averaging the phase values for the different channels in each IF
band and subtracting these averages from the corresponding spectral line visibility
data.

*Note that the required sign of the exponent in this and similar expressions used in this subsection
may be positive or negative, depending on the sign conventions used.

Finally, it is necessary to correct for remaining instrumental phases and for the 
different atmospheric and ionospheric phase shifts, which may be large for widely 
separated sites. In imaging strong continuum sources, this can be achieved by using 
phase closure, as described in Sect. 10.3. A similar approach can be used in imaging 
a distribution of maser point sources, by selecting a strong spectral component that is 
seen at all baselines and assuming that it represents a single point source. Then if the 
phase for this component at one arbitrarily chosen antenna is assumed to be zero, the 
relative phases for the other antennas can be deduced from the fringe phases. Since 
these phases are attributed to the atmosphere over each antenna, the correction can 
be applied to all frequency components within the measured spectrum. This method 
of using one maser component to provide a phase reference is discussed in more 
detail in Sect. 12.7, together with fringe frequency mapping, a technique that is 
useful in determining the positions of major components in a large field of masers. 


10.5.2 Variation of Spatial Frequency Over the Bandwidth 


The effect of using the center frequency of the receiver passband in calculating the 
values of u and v for all frequencies within the passband is discussed in Sect. 6.3.1. 
Consider, for example, a single discrete source for which the visibility function has 
a maximum centered on the (u, v) origin and decreases monotonically for a range of 
increasing u and v. If we use the frequency at the band center $\nu_0$ to calculate u and
v for a frequency at the high end of the band, that is, $\nu > \nu_0$, then the values of u
and v will be underestimated. The measured visibility will fall off too quickly with
u and v, and the central peak of the visibility function will be too narrow. Hence,
the image will be too wide in $l$ and $m$. Thus, if the source radiates a
spectral line at the blueshifted side of the bandwidth, the angular dimensions may 
be overestimated and similarly underestimated at the redshifted side. This effect can 
be described as chromatic aberration. 

As discussed in Sect. 6.3, for observations with a spectral line (multichannel) 
correlator, the visibility measured for each channel can be expressed as a function 
of the (u,v) values appropriate for the frequency of the channel. This corrects 
the chromatic aberration but causes the (u,v) range over which the visibility is 
measured to increase over the bandwidth in proportion to the frequency. Thus, the 
width of the synthesized beam (i.e., the angular resolution) and the angular scale 
of the sidelobes vary over the bandwidth. The variation of the resolution can, if 
necessary, be corrected by truncation or tapering of the visibility data to reduce the 
resolution to that of the lowest frequency within the passband. 
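The per-channel scaling of the spatial frequencies described above is simple to apply. The sketch below assumes that $(u, v)$ have been computed once at the band-center frequency and rescales them for each channel. It also reports the fractional error made if the band-center values are used for every channel. Function and argument names are hypothetical.

```python
def channel_uv(u0, v0, nu0_hz, channel_freqs_hz):
    """Rescale band-center spatial frequencies (u0, v0) to each channel frequency.

    Spatial frequency is baseline length in wavelengths, so it is proportional
    to the observing frequency.
    """
    return [(u0 * f / nu0_hz, v0 * f / nu0_hz) for f in channel_freqs_hz]


def fractional_uv_error(nu0_hz, channel_freqs_hz):
    """Fractional underestimate of (u, v) when the band-center value is used instead."""
    return [(f - nu0_hz) / f for f in channel_freqs_hz]
```

For a 1% offset from band center, the spatial frequencies, and hence the apparent angular scale of the source, are in error by about 1%, which matters when a dynamic range of several hundred or more is needed.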
10.5.3 Accuracy of Spectral Line Measurements


The spectral dynamic range of an image after final calibration is an estimate of 
the accuracy of the measurement of spectral features expressed as a fraction of the 
maximum signal amplitude. It can be defined as the variation in the response of 
different channels to a continuum signal divided by the maximum response, the 
variation being a result of noise and instrumental errors. When the amplitude of a 
spectral line is only a few percent of the continuum that is present, as in the case of 
a recombination line or a weak absorption line, the accuracy of spectral line features 
depends on the accuracy with which the response to the continuum can be separated 
from that to the line. In such a case, a dynamic range of order $10^3$ is required to
measure a line profile to an accuracy of 10%. Hence, we see the importance of 
accurate bandpass calibration and of correction for chromatic aberration. 
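As a rough worked example of this requirement (the symbols $f$, $\epsilon$, and $D$ are introduced here for illustration only): if the line amplitude is a fraction $f$ of the continuum and the line profile is to be measured to a fractional accuracy $\epsilon$, the channel-to-channel errors in the continuum response must be smaller than about $f\epsilon$ of the continuum, so the required spectral dynamic range is approximately

$$D \gtrsim \frac{1}{f\,\epsilon} , \qquad f = 0.01,\ \epsilon = 0.1 \;\Rightarrow\; D \gtrsim 10^{3} .$$

For a line that is 3% of the continuum, the same accuracy requires $D \gtrsim 300$.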

Various techniques have been used to help subtract the continuum response from 
an image. It is necessary to choose the receiver bandwidth so as to include some 
channels that contain continuum only, at frequencies on either side of the line 
features. A straightforward method is to use an average of the line-free channel data 
to make a continuum image and subtract this image from each of the images derived 
for a channel with line emission. Unless the receiver bandwidth is sufficiently small 
compared with the center frequency, it is likely that a correction for chromatic 
aberration should be used in making the continuum image. If the continuum 
emanates from point sources, the positions and flux densities of the sources provide 
a convenient model. For the most precise subtraction, the continuum response 
should be calculated separately for each line channel, using the individual channel 
frequencies in determining the (u, v) values. The subtraction should be performed 
in the visibility data. Use of deconvolution algorithms in the continuum subtraction 
is briefly discussed in Sect. 11.8.1. 
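The simplest form of continuum subtraction in the visibility domain can be sketched as follows. For one baseline and one time sample, the continuum is estimated as the mean of the line-free channels and subtracted from every channel. This zeroth-order version ignores the variation of $(u, v)$ with channel frequency, so it is only adequate when the fractional bandwidth is small or the continuum source is near the phase center; it is an illustrative assumption, not the precise per-channel procedure recommended above. Names are hypothetical.

```python
def subtract_continuum(vis, line_free_channels):
    """Subtract the mean of the line-free channels from a visibility spectrum.

    vis                : complex visibilities, one per channel (one baseline, one time)
    line_free_channels : indices of channels known to contain continuum only
    Returns (line-only spectrum, continuum estimate).
    """
    if not line_free_channels:
        raise ValueError("at least one line-free channel is required")
    continuum = sum(vis[k] for k in line_free_channels) / len(line_free_channels)
    return [v - continuum for v in vis], continuum
```

Choosing line-free channels on both sides of the line, as the text recommends, also makes it possible to extend this sketch to a linear fit across the band when the continuum has a spectral slope.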


10.5.4 Presentation and Analysis of Spectral Line Observations 


Spectral line data can be presented as three-dimensional distributions of pixels in 
$(l, m, \nu)$. For physical interpretation, the Doppler shift in the frequency dimension
is often converted to radial velocity $v_r$ with respect to the rest frequency of the
line. The relationship between frequency and velocity is given in Appendix 10.2. A
model of such a three-dimensional distribution is shown in Fig. 10.15. Continuum
sources are represented by cylindrical functions of constant cross section in $l$ and $m$.

The three-dimensional data cube that contains the images for the individual 
channels can be thought of as representing a line profile for each pixel in
two-dimensional (l, m) space. To simplify the ensemble of images, it is often useful
to plot a single (l, m) image of some feature of the line profile. This feature might
be the integrated intensity

$$ \Delta\nu \sum_i I_i(l, m) , \tag{10.90} $$

where i indicates the range of spectral channels, which are spaced at intervals $\Delta\nu$ in
frequency. For an optically thin radiating medium such as neutral hydrogen, this is
proportional to the column density of radiating atoms or molecules. The
intensity-weighted mean velocity is an indicator of large-scale motion,

$$ \langle v_r(l, m) \rangle = \frac{\sum_i I_i(l, m)\, v_{ri}}{\sum_i I_i(l, m)} . \tag{10.91} $$

The intensity-weighted velocity dispersion

$$ \sigma_v(l, m) = \sqrt{ \frac{\sum_i I_i(l, m)\, [v_{ri} - \langle v_r(l, m) \rangle]^2}{\sum_i I_i(l, m)} } \tag{10.92} $$

is an indicator of random motions within the source. The summation in the velocity
dimension is performed separately for each (l, m) pixel of the images. In each of
the three quantities in expressions (10.90)-(10.92), the intensity values correspond
to the specific line of interest, continuum features having been subtracted out.
In obtaining the best estimates for these three quantities, it should be noted that
including ranges of (l, m, $v_r$) that contain no discernible emission only adds noise
to the results.

Fig. 10.15 Three-dimensional representation of spectral line data in right ascension, declination,
and frequency. The frequency axis is calibrated in velocity corresponding to the Doppler shift of
the rest frequency of the line. The flux density or intensity of the radiation is not shown but could
be represented by color or shading. The indicated velocity has no physical meaning for continuum
sources, which are represented by cylindrical forms of constant cross section normal to the velocity
dimension. Spectral line emission is indicated by the variation of position or intensity with velocity.
From Roelfsema (1989), courtesy of and © the Astronomical Society of the Pacific.
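
The three quantities in expressions (10.90)-(10.92) can be sketched numerically as follows. This is only an illustration under stated assumptions: the cube is taken to be a continuum-subtracted NumPy array ordered (channel, m, l), and the function name `moment_maps` and the simple intensity `threshold` used to exclude emission-free channels are hypothetical choices, not a procedure prescribed in the text.

```python
import numpy as np

def moment_maps(cube, v_r, dnu, threshold=0.0):
    """Integrated intensity, mean velocity, and velocity dispersion per pixel.

    cube : array (n_chan, n_m, n_l) of continuum-subtracted intensities (assumed layout)
    v_r  : array (n_chan,) of channel velocities
    dnu  : channel spacing in frequency
    """
    cube = np.asarray(cube, dtype=float)
    v = np.asarray(v_r, dtype=float)[:, None, None]
    # Samples at or below the threshold are dropped so that they do not add noise.
    w = np.where(cube > threshold, cube, 0.0)
    total = w.sum(axis=0)
    integrated = dnu * total                                    # Eq. (10.90)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean_v = (w * v).sum(axis=0) / total                    # Eq. (10.91)
        disp = np.sqrt((w * (v - mean_v) ** 2).sum(axis=0) / total)  # Eq. (10.92)
    return integrated, mean_v, disp
```

Pixels in which no channel exceeds the threshold return NaN for the mean velocity and dispersion, which makes emission-free regions easy to mask afterward.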

Exploring the relationships between three-dimensional images in (l, m, $v_r$) and
the three-dimensional distribution of the radiating material is an astronomical
concern. As a simple example, consider a spherical shell of radiating material. If
the material is at rest, it will appear in (l, m, $v_r$) space as a circular disk in the plane
of zero velocity, with brightening at the outer edge. If the shell is expanding with
the same velocity in all directions, it will appear in (l, m, $v_r$) space as a hollow
ellipsoidal shell. Interpretation of observations of rotating spiral galaxies is more 
complex. An example of a model galaxy is given by Roelfsema (1989), and a more 
extensive discussion can be found in Burton (1988). 


10.6 Miscellaneous Considerations 


10.6.1 Interpretation of Measured Intensity 


The quantity measured in a synthesized image is the radio intensity, but V is usually
calibrated in terms of the equivalent flux density of a point source, and the intensity
in the resulting image is in units of flux density per beam area $\Omega_0$, which is
given by

$$ \Omega_0 = \iint_{\rm main\ lobe} \frac{b_0(l, m)\, dl\, dm}{\sqrt{1 - l^2 - m^2}} . \tag{10.93} $$

The response to an extended source is the convolution of the sky intensity I(l, m)
with the synthesized beam $b_0(l, m)$. Note that since there is often no measured
visibility value at the (u, v) origin, the integral of $b_0(l, m)$ over all angles is zero;
that is to say, there is no response to a uniform level of intensity. At any point on
the extended source where the intensity varies slowly compared with the width of
the synthesized beam, the convolution with $b_0(l, m)$ results in a flux density that is
approximately $I\Omega_0$. Thus, the scale of the image can also be interpreted as intensity
measured in units of flux density per beam area $\Omega_0$. For a discussion of imaging
wide sources and measuring the intensity of extended components of low spatial 
frequency, see Sects. 11.5 and 10.4. 
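
As a rough illustration of the beam-area scaling, the sketch below assumes that the main lobe of the synthesized beam is an elliptical Gaussian and that the field is small enough for the factor $\sqrt{1 - l^2 - m^2}$ in Eq. (10.93) to be taken as unity. Both are assumptions of the example, as are the function names.

```python
import math

def gaussian_beam_area(theta_maj, theta_min):
    """Solid angle (sr) of an elliptical Gaussian main lobe.

    theta_maj, theta_min : full widths at half maximum, in radians.
    Small-angle approximation; the Gaussian shape is an assumption.
    """
    return math.pi * theta_maj * theta_min / (4.0 * math.log(2.0))

def intensity_from_flux_per_beam(s_per_beam_jy, theta_maj, theta_min):
    """Convert an image value in Jy per beam area to intensity in Jy/sr."""
    return s_per_beam_jy / gaussian_beam_area(theta_maj, theta_min)
```

For a circular beam this gives a beam area of about 1.133 times the square of the half-power width.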


10.6.2 Ghost Images 


Figure 10.14 illustrates how bandpass ripples are introduced into the visibility as a 
function of frequency, as a result of the sharp edges in the cross power spectrum. A
related effect discussed by Bos (1984) is the introduction of “ghost” images into the
image derived from the observations. The ghost structure appears at a position that,
relative to the true structure, is diametrically opposite with respect to the field center.
For each spectral channel, the amplitude of the ghost structure is proportional to the
amplitude of the ripple component. Thus, it is most serious for the channels at the
edges of the receiver passband, as can be seen from Fig. 10.14b.

The ghost phenomenon is most easily explained by considering a simple example.
Suppose there is a point source of unit amplitude at position (l, m) = ($l_1$, 0),
where (0, 0) is the field center, and it is observed over a range of baselines u. The
fringe visibility of a point source is the Fourier transform⁴ with respect to l of a delta
function at $l_1$, which is

$$ V_1(u) = e^{-j2\pi u l_1} = \cos(2\pi u l_1) - j \sin(2\pi u l_1) . \tag{10.94} $$

Suppose that a multichannel spectral correlator is used and there is a visibility data
set for each spectral channel. The ripples across the spectrum in Fig. 10.14 have the
effect that the relative amplitudes of the sine and cosine components are no longer
equal, as they are in Eq. (10.94), so we rewrite Eq. (10.94) as

$$ V_1'(u) = \cos(2\pi u l_1) - j(1 + \Delta) \sin(2\pi u l_1) . \tag{10.95} $$

Here, a component of relative amplitude $\Delta$ has been added to the imaginary
component, which has the most severe ripples. $\Delta$ is positive for a channel in which
there is a peak in the imaginary-component ripple. To determine the effect of the
term $-j\Delta \sin(2\pi u l_1)$ in the image, we take its Fourier transform with respect to u,
which is $\Delta[\delta(l + l_1) - \delta(l - l_1)]/2$. Thus, the ripple adds to the image a delta function
of amplitude $\Delta/2$ at $-l_1$, which is the ghost, and subtracts a delta function of the
same amplitude from the true image⁵ at $l_1$. For a source at the field center, the ghost
and the true image combine, providing a correct measure of the source intensity.
Since the visibility data are usually not calibrated prior to the spectral filtering, 
the relative amplitudes of the real and imaginary components in Eq. (10.94) result 
from the instrumental phases introduced by the receiving system as well as from 
the structure of the source. If these instrumental phase data are lost after calibration 
of the visibility, precise removal of the ghost is not possible. However, the effect 
of the ripples can be reduced by use of smoothing functions on the spectral data 
before creating the image, as discussed earlier. If the spectral data are averaged to 
provide a continuum result before assigning (u, v) values, the effect of the frequency
difference of the channels with high amplitude ripples at the two edges of the
passband may be sufficient to separate the ghost into two components, as shown
by Bos (1985). This separation will not occur if the (u, v) values are individually
assigned for each spectral channel.

⁴ In the Fourier transformations used here, we follow Bracewell (2000), who, for the delta (impulse)
function, defines a “transform in the limit” by considering two Gaussian functions, $|a| e^{-\pi a^2 l^2}$ and
$e^{-\pi u^2/a^2}$, that are a Fourier pair. As $a \to \infty$, the first Gaussian tends toward a delta function at
the l-origin and the second toward unity. For a delta function at $l_1$, we use the shift theorem and
multiply by $e^{-j2\pi u l_1}$.

⁵ Bos (1984, 1985) refers to the ripple-induced component at the true image position as the “hidden
component.”
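
The appearance of the ghost can be checked with a small one-dimensional numerical sketch. The grid size, source position, and ripple amplitude below are arbitrary illustrative values, and the image is formed with NumPy's inverse discrete Fourier transform. With that particular transform convention the signs of the two ripple-induced components come out opposite to those quoted above, so only their magnitudes and relative sign should be compared.

```python
import numpy as np

N, n1, delta = 64, 10, 0.2            # grid size, source pixel, ripple amplitude (illustrative)
k = np.arange(N)                      # baseline samples, u = k
theta = 2.0 * np.pi * k * n1 / N      # 2*pi*u*l_1 with l_1 = n1/N

v_ideal = np.cos(theta) - 1j * np.sin(theta)                    # Eq. (10.94)
v_ripple = np.cos(theta) - 1j * (1.0 + delta) * np.sin(theta)   # Eq. (10.95)

# Pixel n corresponds to l = n/N; pixel N - n corresponds to l = -n/N.
img_ideal = np.fft.ifft(v_ideal).real
img_ripple = np.fft.ifft(v_ripple).real

ghost = img_ripple[N - n1]                      # response at -l_1
true_change = img_ripple[n1] - img_ideal[n1]    # change of the true image at +l_1
```

Both `ghost` and `true_change` have magnitude `delta / 2` and opposite signs, and the ideal image has no response at the mirrored position.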

Bos (1984) points out that the ghost can be removed, or substantially attenuated,
by $\pi/2$ switching of the relative phase between each signal pair before cross
correlating, and restoring the phase before transformation of the visibility data to
form an image. For the source considered in Eq. (10.94), the introduction of $\pi/2$
into the differential phase for an antenna pair results in the visibility

$$ V_2(u) = j e^{-j2\pi u l_1} = j\cos(2\pi u l_1) + \sin(2\pi u l_1) . \tag{10.96} $$

The imaginary part consists of the cosine components, which are the real part in
Eq. (10.94). Adding the visibility term of relative amplitude $\Delta$ resulting from the
ripples in the imaginary part of the spectrum, as in Eq. (10.95), we have

$$ V_2'(u) = j(1 + \Delta)\cos(2\pi u l_1) + \sin(2\pi u l_1) . \tag{10.97} $$

To remove the effect of the quadrature phase switch, we multiply Eq. (10.97) by
j. The visibility term introduced by the ripple then becomes $-\Delta \cos(2\pi u l_1)$, and
taking the Fourier transform with respect to u, we find that the contribution of the
ripple to the image is $-\Delta[\delta(l + l_1) + \delta(l - l_1)]/2$. Again, there are delta functions at
$\pm l_1$, but in this case, they both have the same sign. Thus, the result of averaging the
images with the two positions of the phase switch is to cancel the ghost but double
the amplitude loss of the true image. Note that we have assumed that the quadrature
phase shift introduced by the switch can be represented by the factor j in Eq. (10.96).
If the sign of the phase shift is such that the factor is −j, then the sign of the right
side of Eq. (10.96) must be reversed. If the sign is wrong, the effect is to double the
amplitude of the ghost but restore the amplitude of the image.
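
The cancellation of the ghost by quadrature phase switching can be illustrated in the same one-dimensional setting. In this sketch the ripple is modeled as a gain of $(1 + \Delta)$ on the imaginary part only, and the phase is restored by multiplying by −j, the inverse of the inserted factor j. These modeling choices and all numerical values are assumptions of the example, and the overall signs depend on the transform convention used.

```python
import numpy as np

N, n1, delta = 64, 10, 0.2                    # illustrative values
theta = 2.0 * np.pi * np.arange(N) * n1 / N   # 2*pi*u*l_1 on an integer u grid

def with_ripple(v, delta):
    """Scale only the imaginary part by (1 + delta), as in Eq. (10.95)."""
    return v.real + 1j * (1.0 + delta) * v.imag

v_true = np.exp(-1j * theta)                            # Eq. (10.94)
v_state0 = with_ripple(v_true, delta)                   # switch off, Eq. (10.95)
v_state1 = -1j * with_ripple(1j * v_true, delta)        # switch on, ripple added, phase restored

img0 = np.fft.ifft(v_state0).real
img1 = np.fft.ifft(v_state1).real
img_avg = 0.5 * (img0 + img1)                 # average of the two switch positions
```

The ghost responses at pixel `N - n1` have equal magnitude and opposite sign in `img0` and `img1`, so they vanish in `img_avg`, while an amplitude error of relative size `delta / 2` remains at the true position `n1`.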


10.6.3 Errors in Images 


A very useful technique for investigating suspicious or unusual features in any
synthesized image, continuum or spectral line, is to compute an inverse Fourier
transform (i.e., from intensity to visibility), including only the feature in question.
A distribution in the (u,v) plane concentrated in a single baseline, or in a series 
of baselines with a common antenna, could indicate an instrumental problem. A 
distribution corresponding to a particular range of hour angle of the source could 
indicate the occurrence of sporadic interference. 

An aid in identifying erroneous features is a familiarity with the behavior of 
functions under Fourier transformation; see, for example, Bracewell (2000) and 
the discussion by Ekers (1999). A persistent error in one antenna pair will, for 
an east-west spacing, be distributed along an elliptical ring centered on the (u, v)
origin, and in the (l, m) plane will give rise to an elliptical feature with a radial
profile in the form of the zero-order Bessel function. An error of short duration on
one baseline introduces two delta functions representing the measurement and its
conjugate. In the image, these produce a sinusoidal corrugation over the (l, m) plane.
The amplitude in the image plane may be small, since in an M × N visibility
matrix, the effect of the two erroneous points is diluted by a factor of $2(MN)^{-1}$,
which is usually of order $10^{-3}$ to $10^{-6}$. Thus, a single short-duration error could be
acceptable if, in the image plane, it is small compared with the noise.
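
The dilution factor can be verified with a short sketch in which a single erroneous cell and its conjugate are placed in an otherwise empty M × N visibility grid. The function name, the grid indices, and the use of NumPy's normalized inverse FFT (which divides by MN) are choices made for this illustration only.

```python
import numpy as np

def corrugation_from_bad_point(M, N, p, q, err):
    """Image-plane response to an error `err` in grid cell (p, q) and its conjugate.

    The inverse FFT is normalized by 1/(M*N), so a unit error produces a
    sinusoidal corrugation of amplitude 2/(M*N).
    """
    vis_err = np.zeros((M, N), dtype=complex)
    vis_err[p % M, q % N] += err
    vis_err[-p % M, -q % N] += np.conj(err)   # Hermitian partner at (-u, -v)
    return np.fft.ifft2(vis_err).real

img = corrugation_from_bad_point(64, 128, 5, 9, 1.0)
```

For this 64 × 128 grid the corrugation amplitude is 2/8192, about 2.4 × 10⁻⁴ of the erroneous value.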

Errors of an additive nature combine by addition with the true visibility values.
In the image, the Fourier transform of the error distribution $\varepsilon_{\rm add}(u, v)$ is added to the
intensity distribution, and we have

$$ V(u, v) + \varepsilon_{\rm add}(u, v) \longleftrightarrow I(l, m) + \bar{\varepsilon}_{\rm add}(l, m) . \tag{10.98} $$


Other types of additive errors result from interference, cross coupling of system 
noise between antennas, and correlator offset errors. The Sun is many orders of 
magnitude stronger than most radio sources and can produce interference of a 
different character from that of terrestrial sources because of its diurnal motion. The 
response to the Sun is governed mainly by the sidelobes of the primary beam, the 
difference in fringe frequencies for the Sun and the target source, and the bandwidth 
and visibility averaging effects. Solar interference is most severe for low-resolution 
arrays with narrow bandwidths. Cross coupling of noise (cross talk) occurs only 
between closely spaced antennas and is most severe for low elevation angles when 
shadowing of antennas may occur. 

A second class of errors comprises those that combine with the visibility in a 
multiplicative manner, and for these, we can write 


$$ V(u, v)\, \varepsilon_{\rm mul}(u, v) \longleftrightarrow I(l, m) \ast\ast\, \bar{\varepsilon}_{\rm mul}(l, m) . \tag{10.99} $$


The Fourier transform of the error distribution is convolved with the intensity 
distribution, and the resulting distortion produces erroneous structure connected 
with the main features in the image. In contrast, the distribution of errors of 
the additive type is unrelated to the true intensity pattern. Multiplicative errors 
mainly involve the gain constants of the antennas and result from calibration errors, 
including antenna pointing and, in the case of VLBI systems, radio interference (see 
Sect. 16.4). 

Distortions that increase with distance from the center of the image constitute 
a third category of errors. These include the effects of noncoplanar baselines 
(Sect. 11.7), bandwidth (Sect. 6.3), and visibility averaging (Sect. 6.4), which are 
predictable and therefore somewhat different in nature from the other distortions 
mentioned above.


10.6.4 Hints on Planning and Reduction of Observations


Making the best use of synthesis arrays and similar instruments requires an 
empirical approach in some areas, and the best procedures for analyzing data are 
often gained by experience. Much helpful information exists in the handbooks 
on specific instruments, symposium proceedings, etc. [see, for example, Perley, 
Schwab, and Bridle (1989) and Taylor, Carilli, and Perley (1999)]. A few points 
are discussed below. 

In choosing the observing bandwidth for continuum observations, the radial
smearing effect should be considered, since the SNR for a point source near the
edge of the field is not necessarily maximized by maximizing the bandwidth. Then,
in choosing the data-averaging time, the resulting circumferential smearing can be
made about equal to the radial effect. The required condition is obtained from
Eqs. (6.75) and (6.80) and for high declinations is

$$ \frac{\Delta\nu}{\nu_0} \approx \omega_e \tau_a . \tag{10.100} $$

Here, $\nu_0$ is the center frequency of the observing band, $\Delta\nu$ is the bandwidth, $\omega_e$
is the Earth’s angular rotation velocity, and $\tau_a$ is the averaging time. When attempting
to detect a weak source of measurable angular diameter, or an extended emission, it
is important not to choose an angular resolution that is too high. The SNR for an
extended source is approximately proportional to $I\Omega_0$, as discussed in the previous
section. The observing time required to obtain a given SNR is proportional to $\Omega_0^{-2}$,
or to $\theta_b^{-4}$, where $\theta_b$ is the synthesized beamwidth.
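
A minimal numerical sketch of these two scalings follows. The value adopted for the Earth's angular rotation rate, the function names, and the example of a 1-MHz band at 1.4 GHz are assumptions for illustration.

```python
OMEGA_E = 7.2921e-5   # Earth's angular rotation rate in rad/s (assumed value)

def matched_averaging_time(bandwidth_hz, center_freq_hz, omega_e=OMEGA_E):
    """Averaging time (s) for which the two smearing effects are comparable.

    Follows the high-declination condition bandwidth / center frequency
    approximately equal to omega_e * averaging time.
    """
    return bandwidth_hz / (center_freq_hz * omega_e)

def observing_time_ratio(beam_new, beam_old):
    """Factor by which observing time grows for fixed SNR on an extended source,
    using the inverse fourth-power dependence on synthesized beamwidth."""
    return (beam_old / beam_new) ** 4
```

For example, `matched_averaging_time(1e6, 1.4e9)` is close to 10 s, and halving the synthesized beamwidth increases the required observing time by a factor of 16.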

If the antenna beam contains a source that is much stronger than the features to 
be studied, the response to the strong source can be subtracted, provided it is a point 
source or one that can be accurately modeled. This is best done by subtracting the 
computed visibility before gridding the measurements for the FFT. The subtracted 
response will then accurately include the effect of the sidelobes of the synthesized 
beam. Nevertheless, the precision of the operation will be reduced if the source 
response is significantly affected by bandwidth, visibility averaging, and similar 
effects, so it may be best to place the source to be subtracted at the center of the 
field. When observing a very weak source, it may be advisable to place the source a 
few beamwidths away from the (l, m) origin to avoid confusion with residual errors
from correlator offsets, etc. 

As part of the procedure in making any image, it may be useful also to make a 
low-resolution image covering the entire area of the primary antenna beam. For this 
image, the data can be heavily tapered in the (u, v) plane to reduce the resolution 
and thus also the computation. Such an image will reveal any sources outside the 
field of the final image that may introduce aliased responses in the FFT. Aliasing of 
these sources can be suppressed by subtraction of their visibility or use of a suitable 
convolving function. The sidelobe or ringlobe responses to such a source are also 
eliminated by subtraction of the source but not by convolution in the (u, v) plane.
The low-resolution image will also emphasize any extended low-intensity features
that might otherwise be overlooked. 


10.7 Observations of Cosmological Fine Structure 


10.7.1 Cosmic Microwave Background 


The anisotropy of the cosmic microwave background (CMB), which is about $10^{-5}$
of the mean temperature of 2.7 K, was first detected by the COBE mission (Smoot
et al. 1992), and its characteristics were explored in great detail by the WMAP
mission (Bennett et al. 2003) and the Planck mission (Planck Collaboration 2016).
The data from these missions, obtained using total-power beam-switching
techniques, revealed a major peak in the angular spectrum of the background
fluctuations at ~1.6°. Interferometry offers advantages for the study of the
higher-resolution peaks that, like the major peak, are attributed to acoustic waves in the
early photon-baryon plasma at the surface of last scattering. Since interferometers 
do not respond to uncorrelated signals such as those generated within the Earth’s 
atmosphere, it is possible to use ground-based interferometers for investigation of 
the finer angular structure of the CMB. A number of special instruments have been 
developed specifically to cover structure of angular range ~ 0.1° to ~ 3°. These 
include the Degree Angular Scale Interferometer (DASI) (Leitch et al. 2002b; Pryke 
et al. 2002), located at the South Pole; the Cosmic Background Imager (CBI) (Padin
et al. 2002; Readhead et al. 2004), located at Llano de Chajnantor, Chile; and the 
Very Small Array (VSA) (Watson et al. 2003; Scott et al. 2003), in Tenerife. Planar 
arrays, discussed in Sect. 5.6.5, were primarily used for this work. 

In the study of the fluctuations in the CMB, it is the statistics of the tem- 
perature variations rather than images of specific fields on the sky that are of 
interest for comparison with theoretical models. Model power spectra are given 
in terms of spherical harmonics, that is, the amplitudes of multipole moments 
of the temperature variation. Measurements of the angular spectrum of the CMB 
in this form can be derived directly from the Fourier components measured by 
interferometry without forming images of the structure on the sky. It is assumed 
that the CMB spectrum can be expressed as a function with circular symmetry 
(rotational invariance), since there is no preferred direction in the structure on the 
sky. Thus, characteristics of the CMB lead to some design considerations that differ 
from those for general-purpose synthesis arrays. The individual antennas need to 
be large enough to allow accurate phase and amplitude calibration with observing 
times of a few minutes, using strong discrete sources. With regard to the antenna 
configuration, the main requirement is to obtain sampling in a radial coordinate, 
$q = \sqrt{u^2 + v^2}$, in the (u, v) plane, rather than uniform sampling in two dimensions,
as required for imaging. To obtain sufficiently fine sampling in q, the antennas were
usually configured so that, considered pairwise, the spacing between centers from
the closest to the most widely spaced increases in increments that are smaller than
the diameter of an antenna. This can be achieved, for example, by the curved arm 
configuration shown for the CBI in Fig. 5.24. 

In CMB measurements, it is also essential to be able to separate out the effects 
of all foreground sources. These signals can be identified by their spectral char- 
acteristics, which, for synchrotron or optically thin thermal emissions, differ from 
the blackbody spectrum of the CMB. Another requirement for CMB interferometry 
is sufficient frequency coverage to allow the spectral characteristics of signals 
to be determined. All three of the systems mentioned above used 10-GHz-wide
receiving bandwidths of 26-36 GHz, subdivided into channels. These frequencies
were chosen to be high enough to take advantage of the increase of CMB flux
density with frequency and also to avoid H₂O and O₂ atmospheric absorption lines.

DASI was designed to provide measurements over a range of multipole moments
ℓ = 100-900 and used 13 antennas of diameter 20 cm with baselines 0.25-1.21 m.
For CBI, the range of ℓ is 400-4250, and 13 antennas of diameter 90 cm with a
range of baselines 1-5.51 m were used. Each array was small enough to allow the
antennas to be mounted on a mechanically rigid faceplate that could be pointed 
in azimuth and altitude so that the normal would track the center of the field 
under observation. The faceplate could also be rotated about its axis, to control 
the parallactic angle of the interferometer fringe patterns on the sky. No delay 
system or fringe rotation was needed, but phase switching was included to remove 
instrumental offsets. In CBI and DASI, the antennas were arranged in patterns with 
threefold symmetry, and thus, a rotation of the faceplate through 120° caused the 
configuration of the antennas to repeat relative to the sky (see Fig. 5.24). This
property was very useful since the response to the sky remains unchanged after 
such a rotation, and variations in the signals resulting from unwanted effects such 
as residual cross talk between antennas could be identified and removed. 
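
The correspondence between the baseline ranges and the multipole ranges quoted above can be checked with the flat-sky relation ℓ ≈ 2πq, where q = D/λ is the baseline length in wavelengths. This relation is not stated in the text and is introduced here as an assumption of the sketch; the function name is hypothetical.

```python
import math

C = 299_792_458.0   # speed of light, m/s

def multipole(baseline_m, freq_hz):
    """Approximate multipole sampled by a baseline (flat-sky assumption).

    Uses ell ~ 2*pi*q with q = D/lambda.
    """
    return 2.0 * math.pi * baseline_m * freq_hz / C

dasi_range = (multipole(0.25, 26e9), multipole(1.21, 36e9))
cbi_range = (multipole(1.0, 26e9), multipole(5.51, 36e9))
```

With the 26-36 GHz band this gives roughly ℓ ≈ 140 to 910 for the DASI baselines and ℓ ≈ 540 to 4160 for the CBI baselines. The quoted ranges are somewhat wider, which is plausible because each antenna pair responds to a band of spatial frequencies about one antenna diameter wide.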

A further problem at the high levels of sensitivity required to observe the CMB 
structure results from thermal radiation from the ground and nearby objects, incident 
through the antenna sidelobes. This can introduce a serious unwanted contribution in 
the responses of the more closely spaced antenna pairs, but the effect decreases with 
increasing antenna spacing. For analysis of the results of observations of this type, 
see Hobson et al. (1995) and White et al. (1999). Further details of observations can 
be found in Leitch et al. (2002a,b) and Padin et al. (2002). 


10.7.2 Epoch of Reionization 


At redshifts corresponding to the period prior to the Epoch of Reionization (EoR), 
it should be possible to detect radiation of the neutral hydrogen line (1420 MHz 
rest frequency). As stars were formed in the early Universe, much of the hydrogen 
became ionized, and this period is referred to as the EoR. This probably occurred at 
a redshift no higher than about 7 or 8 (Morales and Wyithe 2010). Radiation at the 
frequency of the neutral hydrogen line should, in principle, be detectable at a redshift
corresponding to the beginning of the EoR or earlier and should be detectable
in all directions over the sky. However, there is also the cosmic background and
the foreground noise from our Galaxy, and the level of these exceeds the distant
hydrogen line signal by an estimated factor of $10^4$. For detection of a broad faint
background of radiation, in contrast with detection of discrete sources, sensitivity
can be increased by using a large number of small antennas, to maximize sensitivity
to broad structural features. In the image domain, (l, m), the third variable added
is the frequency, $\nu$, and in the spatial frequency domain, (u, v), the corresponding
conjugate variable represents time delay. A basic concern is how redundancy in the
array configuration can be chosen to maximize the sensitivity to different angular 
scales in the search for the reionization signal. Further discussion of the challenges 
associated with EoR imaging can be found in Parsons et al. (2010, 2012, 2014),
Zheng et al. (2013), and Dillon et al. (2015).
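
For orientation, the observing frequency of the redshifted hydrogen line follows from dividing the rest frequency by (1 + z). The sketch below assumes a rest frequency of 1420.40575 MHz; the function name is illustrative.

```python
HI_REST_MHZ = 1420.40575   # rest frequency of the neutral hydrogen line (assumed value)

def observed_hi_frequency_mhz(z):
    """Observed frequency (MHz) of the neutral hydrogen line emitted at redshift z."""
    return HI_REST_MHZ / (1.0 + z)
```

Redshifts of 7 and 8 correspond to observing frequencies near 178 and 158 MHz, respectively.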


Appendix 10.1 The Edge of the Moon as a Calibration Source


During the test phase of bringing an interferometer into operation, it is useful 
to observe sources that produce fringes with high SNR. At frequencies above 
~ 100 GHz, there are not many such sources. The Sun, Moon, and planets, the 
disks of which are resolved by the interferometer fringes, can nevertheless provide 
significant correlated flux density because of their sharp edges. Consider the limb 
of the Moon and the case in which the primary beam of the interferometer elements 
is much smaller than 30’, the lunar diameter. When the antenna beam tracks the 
Moon’s limb, the apparent source distribution is the antenna pattern multiplied 
by a step function; it is assumed that the brightness temperature of the Moon 
is constant within the beam. Approximating the antenna pattern as a Gaussian 
function, assuming that the antennas track a fixed point on the west limb of the 
Moon, and ignoring the curvature of the lunar limb, we can express the effective 
source distribution as 


$$ I(x, y) = \begin{cases} I_0\, e^{-4 (\ln 2)(x^2 + y^2)/\theta_b^2} , & x > 0 , \\ 0 , & x < 0 , \end{cases} \tag{A10.1} $$

where x and y are angular coordinates centered on the beam axis, $\theta_b$ is the full
width of the beam at the half-power level, and in the Rayleigh-Jeans regime,
$I_0 = 2kT_m/\lambda^2$, where $T_m$ is the temperature of the Moon. The visibility function is then


$$ V(u, v) = 2 I_0 \left[ \int_0^\infty e^{-4 (\ln 2) x^2/\theta_b^2} (\cos 2\pi u x - j \sin 2\pi u x)\, dx \right]
   \times \left[ \int_0^\infty e^{-4 (\ln 2) y^2/\theta_b^2} \cos 2\pi v y \, dy \right] . \tag{A10.2} $$

The cosine integral is straightforward, and the sine integral can be written in terms
of a degenerate hypergeometric function ${}_1F_1$ (see Gradshteyn and Ryzhik 1994,
Eq. 3.896.3). The result is

$$ V(u, v) = S_0\, e^{-\pi^2 \theta_b^2 (u^2 + v^2)/(4 \ln 2)}
   \left[ 1 - j \sqrt{\frac{\pi}{\ln 2}}\; \theta_b u \; {}_1F_1\!\left( \frac{1}{2}; \frac{3}{2}; \frac{\pi^2 \theta_b^2 u^2}{4 \ln 2} \right) \right] , \tag{A10.3} $$


where

$$ S_0 = \frac{\pi k T_m \theta_b^2}{4 \lambda^2 \ln 2} \tag{A10.4} $$

is the flux density of the Moon in the half-Gaussian beam. In the limit (u, v) →
(0, 0), the imaginary part of the visibility is zero, and V(u, v) = $S_0$, as expected.
For $T_m$ = 200 K and $\theta_b = 1.2\lambda/d$, where d is the diameter of the interferometer
antennas in meters, $S_0 \simeq 460{,}000/d^2$ Jy. The integral over x in Eq. (A10.2) can
also be written in terms of the error function. For the limit where $u \gg d/\lambda$, the
asymptotic expansion of the error function leads to the convenient approximation


$$ V(u, v = 0) \simeq -j \sqrt{\frac{4 \ln 2}{\pi^3}}\; \frac{S_0}{\theta_b u} \simeq -j\, 0.41\, \frac{k T_m}{d D} , \tag{A10.5} $$


where D is the baseline length. Hence, we have the interesting situation that the 
visibility for a given baseline length increases as the antenna diameter decreases, as 
long as $\theta_b \ll 30'$. The approximation in Eq. (A10.5) is accurate to 2% for D > 2d.
The full visibility function as a function of projected baseline length is shown in
Fig. A10.1. Note that the visibility measured with an interferometer having an east-
west baseline orientation and tracking the north or south limb of the Moon will be 
essentially zero. In the general case, the maximum fringe visibility is obtained by 
tracking the limb of the Moon that is perpendicular to the baseline. 

Although the Moon may produce strong fringes, it is not an ideal calibration 
source. First, libration may make it difficult to track the exact edge of the Moon. 
Second, because the apparent source distribution is determined by the antennas, 
tracking errors introduce amplitude and phase fluctuations. Third, because the 
temperature of the Moon depends on solar illumination, variations around the mean 
temperature of ~ 200 K are significant, especially at short wavelengths. For accurate 
results, the lunar temperature variation should be incorporated into the brightness 
temperature model. 
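
The numbers quoted in this appendix can be reproduced approximately with the sketch below. It assumes a beamwidth of 1.2λ/d, so that the wavelength cancels, and it evaluates the large-spacing limit of Eq. (A10.2) by letting the sine integral tend to 1/(2πu); that limiting step and the function names are assumptions of this example rather than the authors' procedure.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
JY = 1.0e-26         # 1 Jy in W m^-2 Hz^-1

def moon_zero_spacing_flux_jy(d_m, t_moon=200.0, beam_factor=1.2):
    """Flux density of the Moon in the half-Gaussian beam, Eq. (A10.4), in Jy.

    Assumes theta_b = beam_factor * lambda / d, so the wavelength cancels.
    """
    return math.pi * K_B * t_moon * beam_factor**2 / (4.0 * math.log(2.0) * d_m**2) / JY

def moon_limb_visibility_jy(d_m, baseline_m, t_moon=200.0, beam_factor=1.2):
    """Magnitude of the limb visibility (Jy) for baselines much longer than d.

    Large-spacing limit of Eq. (A10.2) at v = 0, with the sine integral
    replaced by its asymptotic value 1/(2*pi*u).
    """
    coeff = beam_factor / (2.0 * math.sqrt(math.pi * math.log(2.0)))
    return coeff * K_B * t_moon / (d_m * baseline_m) / JY
```

For d = 6 m and D = 18 m, these give a zero-spacing flux density of about 12,500 Jy and a visibility magnitude of about 1040 Jy, close to the rounded values quoted for Fig. A10.1.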


Fig. A10.1 Normalized fringe visibility for an interferometer with an east-west baseline
observing the west limb of the Moon at transit (v = 0), vs. $\theta_b u$. $\theta_b \simeq 1.2\lambda/d$ is the half-power
beamwidth of the antenna, d is the antenna diameter, and $u = D/\lambda$ is the baseline in wavelengths.
On the horizontal axis, $\theta_b u$ is approximately equal to 1.2D/d. The dotted line is the imaginary
component of visibility, the dashed line is the real part, and the solid line is the magnitude. Since
the portion of the curve for D/d < 1 is not accessible, the measured visibility is almost purely
imaginary. For d = 6 m and D/d = 3, the zero-spacing flux density [see Eq. (A10.4)] is 12,700
Jy, and the visibility is about 1000 Jy [see Eq. (A10.5)]. Adapted from Gurwell (1998).


Appendix 10.2 Doppler Shift of Spectral Lines


The Doppler shift [e.g., Rybicki and Lightman (1979)] is given by the relation

$$ \frac{\lambda}{\lambda_0} = \frac{\nu_0}{\nu} = \frac{1 + (v/c)\cos\theta}{\sqrt{1 - v^2/c^2}} , \tag{A10.6} $$

where $\lambda_0$ and $\nu_0$ are the rest wavelength and frequency as measured in the reference
frame of the source, the corresponding unsubscripted variables are the wavelength
and frequency in the observer’s frame, v is the magnitude of the relative velocity
between the source and the observer, and $\theta$ is the angle between the velocity vector
and the line-of-sight direction between source and observer in the observer’s frame
($\theta$ < 90° for a receding source). The numerator in Eq. (A10.6) is the classical
Doppler shift caused by the change in distance between the source and the observer.
The denominator is the relativistic time dilation factor, which takes account of the
difference between the period of the radiated wave as measured in the rest frame of
the source and the rest frame of the observer.

Because of the time dilation effect, there will be a second-order Doppler shift 
even if the motion is transverse to the line of sight. For the rest of this discussion, 
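we consider only radial velocities; that is, $\theta$ = 0 or 180°. In this case, the Doppler
shift equation is

$$ \frac{\lambda}{\lambda_0} = \frac{\nu_0}{\nu} = \sqrt{\frac{1 + v_r/c}{1 - v_r/c}} , \tag{A10.7} $$

where $v_r$ is the radial velocity (positive for recession). Solving for velocity, we
obtain

$$ \frac{v_r}{c} = \frac{\nu_0^2 - \nu^2}{\nu_0^2 + \nu^2} \tag{A10.8} $$

or

$$ \frac{v_r}{c} = \frac{\lambda^2 - \lambda_0^2}{\lambda^2 + \lambda_0^2} . \tag{A10.9} $$

Taylor expansions of Eqs. (A10.8) and (A10.9) yield

$$ \frac{v_r}{c} \simeq -\frac{\Delta\nu}{\nu_0} + \frac{1}{2} \frac{\Delta\nu^2}{\nu_0^2} \tag{A10.10} $$

and

$$ \frac{v_r}{c} \simeq \frac{\Delta\lambda}{\lambda_0} - \frac{1}{2} \frac{\Delta\lambda^2}{\lambda_0^2} , \tag{A10.11} $$

where $\Delta\nu = \nu - \nu_0$ and $\Delta\lambda = \lambda - \lambda_0$. For negative $\Delta\nu$, the velocity is
positive and the signal is “redshifted.” Since $\Delta\nu/\nu_0 \simeq -\Delta\lambda/\lambda_0$, the second-order
terms have approximately the same magnitude but opposite signs in Eqs. (A10.10)
and (A10.11).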

Devices for spectroscopy at radio and optical frequencies usually produce data 
that are uniformly spaced in frequency and wavelength, respectively. Hence, to first 
order, the velocity axis can be calculated as a linear transformation of the frequency 
or wavelength axes. Unfortunately, this has led to two different approximations of 
the velocity: 


$$ \frac{v_{r,{\rm radio}}}{c} = -\frac{\Delta\nu}{\nu_0} \tag{A10.12} $$

and

$$ \frac{v_{r,{\rm optical}}}{c} = \frac{\Delta\lambda}{\lambda_0} . \tag{A10.13} $$

The difference between these two approximations can be appreciated by noting
that $v_{r,{\rm radio}}/c = \Delta\lambda/\lambda$. Each velocity scale produces a second-order error in
its estimation of the true velocity; that is, the radio definition underestimates the
velocity, and the optical definition overestimates the velocity by the same amount.
The difference in velocity between the scales as a function of velocity is

$$ \delta v_r = v_{r,{\rm optical}} - v_{r,{\rm radio}} \simeq \frac{v_r^2}{c} . \tag{A10.14} $$

Hence, the identification of the velocity scale used is very important for extragalactic
sources. For example, if $v_r$ = 10,000 km s⁻¹, $\delta v_r \simeq$ 330 km s⁻¹. Failure to
recognize the difference between the velocity conventions can cause considerable
problems when observations are made with narrow bandwidth.
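
The size of the discrepancy between the two conventions can be checked numerically. The sketch below converts a true radial velocity to an observed-to-rest frequency ratio with the relativistic radial Doppler formula and then applies the radio and optical definitions; the function names are illustrative.

```python
import math

C_KMS = 299_792.458   # speed of light, km/s

def observed_over_rest_frequency(v_r_kms):
    """Frequency ratio nu/nu_0 for a purely radial velocity (positive for recession)."""
    beta = v_r_kms / C_KMS
    return math.sqrt((1.0 - beta) / (1.0 + beta))

def radio_velocity(freq_ratio):
    """Radio convention: v = -c * (nu - nu_0) / nu_0."""
    return C_KMS * (1.0 - freq_ratio)

def optical_velocity(freq_ratio):
    """Optical convention: v = c * (lambda - lambda_0) / lambda_0, with lambda/lambda_0 = nu_0/nu."""
    return C_KMS * (1.0 / freq_ratio - 1.0)

ratio = observed_over_rest_frequency(10_000.0)
dv = optical_velocity(ratio) - radio_velocity(ratio)
```

For a true velocity of 10,000 km s⁻¹, the radio value falls below and the optical value above the true velocity, and `dv` is close to the second-order estimate of about 330 km s⁻¹.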

To interpret the velocities of spectral lines, it is necessary to refer them to an
appropriate inertial frame. The rotation velocity of an observer at the equator about
the Earth’s center is about 0.5 km s⁻¹; the velocity of the Earth around the Sun is
about 30 km s⁻¹; the velocity of the Sun with respect to the nearby stars is about
20 km s⁻¹ [this defines the local standard of rest (LSR)]; the velocity of the LSR
around the center of the Galaxy is about 220 km s⁻¹; the velocity of our Galaxy with
respect to the local group is about 310 km s⁻¹; and the velocity of the local group
with respect to the CMB radiation is about 630 km s⁻¹. The most accurate reference
frame beyond the solar system is defined with respect to the CMB. The velocity
of the Sun with respect to the CMB has been determined from measurements of the
dipole anisotropy of the CMB ($v = cT_{\rm dipole}/T_{\rm CMB}$, where $T_{\rm dipole}$ = 3364.3 ± 1.5 μK
and $T_{\rm CMB}$ = 2.7255 ± 0.0006 K), which yields the remarkably precise result of
370.1 ± 0.1 km s⁻¹ toward ℓ = 263.91° ± 0.02° and b = 48.265° ± 0.002°
(Planck Collaboration 2016). Information on these various reference frames is listed 
in Table A10.1.

Table A10.1 Reference frames for spectroscopic observations

                                                                    Motion      Direction^a
Name                       Type of motion                           (km s⁻¹)    ℓ (°)    b (°)
Topocentric                Rotation of Earth                        0.5         -        -
Geocentric                 Rotation of Earth around                 0.013       -        -
                           Earth/Moon barycenter
Heliocentric               Rotation of Earth around Sun             30          -        -
Barycentric                Rotation of Sun around solar system      0.012       -        -
                           barycenter (planetary perturbations)
Local standard of rest     Sun with respect to local stars          20          57       23
(LSR)^c
Galactocentric^b           LSR around center of the Galaxy          220         90       0
Local Galactic standard    Sun with respect to galaxies of the      308         105      −7
of rest^d                  local group
CMB^e                      Sun with respect to CMB                  370         264      48

^a Galactic longitude and latitude.
^b Standard value adopted by the IAU in 1985 (see Kerr and Lynden-Bell 1986). See literature for
more recent determinations.
^c Converted from 20 km s⁻¹ toward α = 18ʰ, δ = 30° (1990). See literature for newer
measurements.
^d Cox (2000).
^e Planck Collaboration (2016).

Most observations are reported with respect to either the solar
system barycenter or the LSR. Velocities of stars and galaxies are usually given
in the former frame, and observations of nonstellar Galactic objects (e.g., molecular 
clouds) are usually given in the latter frame. Accurate determination of the rotation
speed of the Galaxy and its structure depends on precise knowledge of the LSR.
Velocity corrections at many radio observatories are based on a program called DOP
[Ball (1969); see also Gordon (1976)], which has an accuracy of ~0.01 km s⁻¹
because it does not take planetary perturbations into account. Routines such as
CVEL in AIPS are based on this code. Much higher accuracy can be obtained
by more sophisticated programs such as the Planetary Ephemeris Program (Ash
1972) or the JPL Ephemeris (Standish and Newhall 1996). Precise comparison
of velocity measurements at different observatories requires comparison of their
dynamical calculations. Interpretation of pulsar timing measurements also requires
precise velocity correction. 

There is sometimes confusion in the conversion of baseband frequency to
true observed frequency. In the calculation of the spectrum in the baseband by
Fourier transformation of either the data stream or the correlation function with
the FFT algorithm, the first channel corresponds to zero frequency, and the channel
increment is $\Delta\nu_{\rm IF}/N$, where $\Delta\nu_{\rm IF}$ is the bandwidth (half the Nyquist sampling rate)
and $N$ is the total number of frequency channels. The $N$th channel corresponds to
frequency $\Delta\nu_{\rm IF}(1 - 1/N)$. If $N$ is an even number ($N$ is usually a power of two),
channel $N/2 + 1$ corresponds to the center frequency of the baseband. For a system
with only upper-sideband conversions, the sky frequency of the first channel (zero
frequency in the baseband) is the sum of the local oscillator frequencies. Note that
the velocity axes run in opposite directions ($v \propto -\nu$ and $v \propto \nu$) for systems with
net upper- and lower-sideband conversion, respectively.
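
The channel-to-frequency bookkeeping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the parameter `lo_net_hz` (the sky frequency that maps to zero baseband frequency), and the numerical values are hypothetical, and a net lower-sideband system is modeled simply as a reversed frequency axis. Channels are indexed from zero here, so index 0 is zero baseband frequency, index N/2 is the band center, and index N - 1 is the last channel.

```python
def channel_sky_frequencies(lo_net_hz, bandwidth_hz, n_chan, sideband="upper"):
    """Sky frequency of each baseband channel, zero-based index k = 0 .. n_chan - 1.

    lo_net_hz is a hypothetical name for the sky frequency that maps to zero
    baseband frequency (the sum of the LO frequencies when every conversion
    is upper sideband).
    """
    if sideband not in ("upper", "lower"):
        raise ValueError("sideband must be 'upper' or 'lower'")
    step = bandwidth_hz / n_chan                  # channel increment, bandwidth / N
    sign = 1.0 if sideband == "upper" else -1.0   # lower sideband reverses the axis
    return [lo_net_hz + sign * k * step for k in range(n_chan)]


# Illustrative numbers only: 16-MHz baseband, 8 channels, 22-GHz net LO.
usb = channel_sky_frequencies(22.0e9, 16.0e6, 8, "upper")
lsb = channel_sky_frequencies(22.0e9, 16.0e6, 8, "lower")
print(usb[0], usb[4], usb[-1])   # zero frequency, band center, last channel
print(lsb[0], lsb[4], lsb[-1])   # same channels with the frequency axis reversed
```

Because radial velocity decreases as sky frequency increases, the velocity axis built from `usb` runs opposite to the one built from `lsb`.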

There are several velocity shifts of non-Doppler origin that sometimes need to
be taken into account. For spectral lines originating in deep potential wells (for
example, close to black holes), there is an additional time dilation term


$$\gamma_G = \frac{1}{\sqrt{1 - r_s/r}} , \tag{A10.15}$$


where $r$ is the distance from the center of the black hole and $r_s$ is its Schwarzschild
radius ($r_s = 2GM/c^2$), which is valid for $r > r_s$. The total frequency shift [obtained
by generalizing Eq. (A10.6)] is therefore


$$\frac{\nu_0}{\nu} = \left(1 + \frac{v}{c}\cos\theta\right)\gamma_L\gamma_G , \tag{A10.16}$$


where $\gamma_L = 1/\sqrt{1 - v^2/c^2}$ is known as the Lorentz factor. For example, the
radiation from the water masers in NGC 4258 (see Fig. 1.23), which orbit a black
hole at a radius of 40,000 $r_s$, undergoes a velocity shift of about 4 km s$^{-1}$.
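
As a quick numerical check of the figure quoted for NGC 4258, the gravitational term of Eq. (A10.15) can be evaluated directly. The sketch below assumes only that equation; the function names are hypothetical, and the shift is expressed as an equivalent velocity $c(\gamma_G - 1)$, ignoring the Doppler and Lorentz terms.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def gravitational_gamma(r_over_rs):
    """Time dilation factor 1/sqrt(1 - r_s/r) for radius r in units of r_s."""
    if r_over_rs <= 1.0:
        raise ValueError("expression is valid only for r > r_s")
    return 1.0 / math.sqrt(1.0 - 1.0 / r_over_rs)


def equivalent_velocity_shift(r_over_rs):
    """Gravitational frequency shift expressed as a velocity, c * (gamma_G - 1)."""
    return C_KM_S * (gravitational_gamma(r_over_rs) - 1.0)


shift_km_s = equivalent_velocity_shift(40000.0)
print(round(shift_km_s, 2))  # a few km/s, consistent with "about 4 km/s"
```

For $r \gg r_s$ the factor reduces to $\gamma_G - 1 \approx r_s/2r$, which gives the same result to the precision needed here.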

The most important non-Doppler frequency shift for sources at cosmological
distances is due to the expansion of the Universe. In the relatively nearby Universe,
this velocity shift is


$$z = \frac{\lambda}{\lambda_0} - 1 \simeq \frac{H_0 d}{c} , \tag{A10.17}$$


where $H_0$ is the Hubble constant and $d$ is the distance. $H_0$ is about 70 km s$^{-1}$ Mpc$^{-1}$
(Mould et al. 2000). For greater distances ($z > 1$), the relations between $z$ and the
distance and look-back time depend on the cosmological model used [e.g., Peebles
(1993)]. However, given the definition of $z$, the correct frequency will always be
related to it by


$$\nu = \frac{\nu_0}{z + 1} . \tag{A10.18}$$


Other issues regarding observations of cosmologically distant spectral line
sources are discussed by Gordon et al. (1992). An early example of spectroscopic
interferometric observations of a molecular cloud at a cosmological distance ($z =
3.9$) can be found in Downes et al. (1999).
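
The two relations above are simple enough to evaluate by hand, but a short sketch makes the orders of magnitude concrete. The function names are hypothetical, the distance of 100 Mpc is an arbitrary example, and the rest frequency used (461.0408 GHz, the CO J = 4-3 line) is chosen purely for illustration rather than taken from the text.

```python
C_KM_S = 299792.458  # speed of light in km/s


def hubble_redshift(distance_mpc, h0_km_s_mpc=70.0):
    """Low-redshift approximation z ~ H0 d / c; not valid for z of order unity."""
    return h0_km_s_mpc * distance_mpc / C_KM_S


def observed_frequency(rest_frequency, z):
    """Observed frequency nu = nu_0 / (z + 1), in the same units as the input."""
    return rest_frequency / (z + 1.0)


z_near = hubble_redshift(100.0)               # a source at 100 Mpc
nu_obs_ghz = observed_frequency(461.0408, 3.9)  # illustrative line at z = 3.9
print(round(z_near, 4), round(nu_obs_ghz, 2))
```

The second relation holds for any $z$, whereas the first should only be used in the nearby Universe.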


Appendix 10.3 Historical Notes 


A10.3.1 Images from One-Dimensional Profiles 


Early images of the Sun and a few other strong sources were made with linear arrays 
such as the grating array and compound interferometer shown in Fig. 1.13. The 
results were obtained in the form of fan-beam scans. With such an instrument, the 
visibility data sampled at any instant are located on a straight line through the origin 
in the (u, v) plane, as shown in Fig. 10.1. Fourier transformation of the visibility 
data sampled along such a line provides a corrugated surface with a profile given by 
the fan-beam scan, as shown in Fig. A10.2. This can be regarded as one component 
of a two-dimensional image. As the Earth rotates, the angle of the beam on the 
sky varies, so addition of these components builds up a two-dimensional image. 
However, in the fan-beam scans from such arrays, each pair of antennas contributes 
with equal weight to the profile, so an image built up from profiles in such a manner 
exhibits the undesirable characteristics of natural weighting. During the 1950s, 
before digital computers were generally available, the combination of such data to 
provide two-dimensional images with a desirable weighting was a laborious process. 
Christiansen and Warburton’s (1955) solar image involved Fourier transformation, 
weighting, and retransformation of the data by manual calculation. A method of 
combining fan-beam scans without Fourier transformation was later devised by 
Bracewell and Riddle (1967) using convolution to adjust the visibility weighting. 
Basic relationships between one- and two-dimensional responses (Bracewell 1956a) 
are discussed in Sect. 2.4. 
Fig. A10.2 A surface in the (l, m) domain that is the Fourier transform of visibility
data in the (u, v) plane measured along a line making an angle $\theta + \pi/2$ with the u
axis, as shown by the broken line in Fig. 10.1.

A10.3.2 Analog Fourier Transformation 


An optical lens can be used as an analog device for Fourier transformation. Analog 
systems for data processing based on optical, acoustic, or electron-beam processes 
were investigated in the early years but generally have not proved successful for 
synthesis imaging. They lacked flexibility, and a further problem was limitation 
of the dynamic range, which is the ratio of the highest intensity levels to the 
noise in the image. Maintaining image quality in any iterative process that involves 
successive Fourier transformation and retransformation of the same data, as occurs 
in some deconvolution processes (see Chap. 11), requires high precision. Analog 
possibilities for Fourier transformation were discussed by Cole (1979) but became 
irrelevant as more powerful computers became available. 
Chapter 11
Further Imaging Techniques 


This chapter is concerned with techniques of processing that are largely nonlinear
and include deconvolution, that is, removing, to the extent possible, the limitations
of the visibility measurements. There are two principal deficiencies in the visibility
data that limit the accuracy of synthesis images. These are (1) the limited distribution
of spatial frequencies in u and v and (2) errors in the visibility measurements.
The limited spatial frequency coverage can be improved by deconvolution processes 
such as CLEAN that allow the unmeasured visibility to take nonzero values 
within some general constraints on the image. Calibration can be improved by 
adaptive techniques in which the antenna gains, as well as the required image, are 
derived from the visibility data. Wide-field imaging, multifrequency imaging, and 
compressed sensing are also discussed. 


11.1 The CLEAN Deconvolution Algorithm 


One of the most successful deconvolution procedures is the algorithm CLEAN
devised by Högbom (1974). This is basically a numerical deconvolving process
usually applied in the image (l, m) domain. It has become an essential tool in
producing images from incomplete (u, v) data sets. The procedure is to break down 
the intensity distribution into point-source responses that correspond to the original 
imaging process, and then replace each one with the corresponding response to a 
beam that is free of sidelobes. CLEAN can be thought of as a type of compressed 
sensing (see Sect. 11.8.6). 
11.1.1 CLEAN Algorithm


The principal steps are as follows. 


1. Compute the image and the response to a point source by Fourier transformation
   of the visibility and the weighted transfer function. These functions, the synthesized
   intensity and the synthesized beam, are often referred to as the “dirty
   image” and the “dirty beam,” respectively. The spacing of the sample points in
   the (l, m) plane should not exceed about one-third of the synthesized beamwidth.

2. Find the highest intensity point in the image and subtract the response to a point 
source, i.e., the dirty beam, including the full sidelobe pattern, centered on that 
   position. The peak amplitude of the subtracted point-source response is equal to
   $\gamma$ times the corresponding image amplitude. The factor $\gamma$ is called the loop gain, by analogy
   with negative feedback in electrical systems, and commonly has a value of a few
   tenths. Record the position and amplitude of the component removed by inserting
a delta-function component into a model that will become the cleaned image. 

3. Return to step 2 and repeat the procedure iteratively until all significant source 
structure has been removed from the image. There are several possible indicators 
of this condition. For example, one can compare the highest peak with the rms 
level of the residual intensity, look for the first time that the rms level fails 
to decrease when a subtraction is made, or note when significant numbers of 
negative components start to be removed. 

4. Convolve the delta functions in the cleaned model with a clean-beam response, 
that is, replace each delta function with a clean-beam function of corresponding 
amplitude. The clean beam is often chosen to be a Gaussian with a half-amplitude 
width equal to that of the original synthesized (dirty) beam, or some similar 
function that is free from negative values. 

5. Add the residuals (the residual intensity from step 3) into the clean-beam image, 
which is then the output of the process. (When the residuals are added, the Fourier 
transform of the image is equal to the measured visibilities.) 
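
The five steps above can be sketched compactly in one dimension. What follows is a minimal illustration under stated assumptions, not a production implementation: it is one-dimensional, uses plain Python lists, and every name in it (`hogbom_clean`, `convolve_same`, `gaussian_beam`, the made-up `u_samples`, the `clean_fwhm` value) is hypothetical. The dirty beam is sampled over twice the image extent so that it can be subtracted at any pixel, and the loop stops when the residual peak falls below a fixed threshold, which is only one of the possible termination criteria.

```python
import math


def gaussian_beam(half_width, fwhm):
    """Gaussian clean beam with unit peak, sampled on 2*half_width + 1 pixels."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * (d / sigma) ** 2) for d in range(-half_width, half_width + 1)]


def convolve_same(signal, kernel):
    """Direct convolution; kernel peak at len(kernel)//2, output as long as signal."""
    n, c = len(signal), len(kernel) // 2
    out = [0.0] * n
    for i, amp in enumerate(signal):
        if amp == 0.0:
            continue
        for j, k in enumerate(kernel):
            pos = i + j - c
            if 0 <= pos < n:
                out[pos] += amp * k
    return out


def hogbom_clean(dirty, dirty_beam, gain=0.2, threshold=0.01, max_iter=10000, clean_fwhm=3.0):
    """Steps 2-5; dirty_beam has unit peak and length 2*len(dirty) - 1."""
    n = len(dirty)
    center = len(dirty_beam) // 2
    residual = list(dirty)
    model = [0.0] * n                              # delta-function components
    for _ in range(max_iter):                      # step 3: iterate
        peak = max(range(n), key=lambda i: abs(residual[i]))
        if abs(residual[peak]) < threshold:        # simple stopping rule
            break
        comp = gain * residual[peak]               # step 2: loop gain times peak
        model[peak] += comp
        for i in range(n):                         # subtract shifted dirty beam
            residual[i] -= comp * dirty_beam[i - peak + center]
    restored = convolve_same(model, gaussian_beam(center, clean_fwhm))  # step 4
    restored = [r + e for r, e in zip(restored, residual)]              # step 5
    return model, residual, restored


# Step 1 (toy version): dirty beam from a sparse, made-up set of spatial frequencies.
n = 64
u_samples = [1, 2, 3, 5, 8, 13]
dirty_beam = [sum(math.cos(2.0 * math.pi * u * d / n) for u in u_samples) / len(u_samples)
              for d in range(-(n - 1), n)]
sky = [0.0] * n
sky[20] = 1.0                                      # one unit point source
dirty = convolve_same(sky, dirty_beam)
model, residual, restored = hogbom_clean(dirty, dirty_beam, gain=0.2, threshold=0.01)
print(round(model[20], 4), round(max(abs(r) for r in residual), 4))
```

For this single on-grid point source, every component is removed at the same pixel, and the residual peak shrinks by a factor $(1 - \gamma)$ per cycle until it drops below the threshold.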


It is assumed that each dirty-beam response that is subtracted represents the 
response to a point source. As discussed in Sect. 4.4, the visibility function of a 
point source is a pair of real and imaginary sinusoidal corrugations that extend to 
infinity in the (u, v) plane. Any intensity feature for which the visibility function is 
the same within the (u,v) area sampled by the transfer function would produce 
a response in the image identical to the point-source response. Högbom (1974) 
has pointed out that much of the sky is a random distribution of point sources 
on an empty background, and CLEAN was initially developed for this situation. 
Nevertheless, experience shows that CLEAN also works on well-extended and 
complicated sources. 

The result of the first three steps in the CLEAN procedure outlined above can 
be represented by a model intensity distribution that consists of a series of delta 
functions with magnitudes and positions representing the subtracted components. 
Since the modulus of the Fourier transform of each delta function extends uniformly 
to infinity in the (u, v) plane, the visibility is extrapolated as required beyond the
cutoff of the transfer function. 

The delta-function components do not constitute a satisfactory model for astronomical
purposes. Groups of delta functions with separations no greater than the
beamwidth may actually represent extended structure. Convolution of the delta- 
function model by the clean beam, which occurs in step 4, removes the danger 
of overinterpretation. Thus, CLEAN performs, in effect, an interpolation in the 
(u,v) plane. Desirable characteristics of a clean beam are that it should be free 
from sidelobes, particularly negative ones, and that its Fourier transform should be 
constant inside the sampled region of the (u, v) plane and rapidly fall to a low level 
outside it. These characteristics are essentially incompatible since a sharp cutoff in 
the (u, v) plane results in oscillations in the (l, m) plane. The usual compromise is
a Gaussian beam, which introduces a Gaussian taper in the (u, v) plane. Since this 
function tapers the measured data and the unmeasured data generated by CLEAN, 
the resulting intensity distribution no longer agrees with the measured visibility data. 
However, the absence of large, near-in sidelobes improves the dynamic range of the 
image, that is, it increases the range of intensity over which the structure of the 
image can reliably be measured. 

As discussed in Chap. 10, we cannot directly divide out the weighted spatial 
transfer function on the right side of Eq.(10.4) because it is truncated to zero 
outside the areas of measurement. In CLEAN, this problem is solved by analyzing 
the measured visibility into sinusoidal visibility components and then removing the 
truncation so that they extend over the full (u, v) plane. Selecting the highest peak in 
the (l,m) plane is equivalent to selecting the largest complex sinusoid in the (u, v) 
plane. 

At the point that the component subtraction is stopped, it is generally assumed 
that the residual intensity distribution consists mainly of the noise. Retaining the 
residual distribution within the image is, like the convolution with the clean beam, a 
nonideal procedure that is necessary to prevent misinterpretation of the final result. 
Without the residuals added in step 5, there would be an amplitude cutoff in the 
structure corresponding to the lowest subtracted component. Also, the presence of 
the background fluctuations provides an indication of the level of uncertainty in the 
intensity values. An example of the effect of processing with the CLEAN algorithm 
is shown in Fig. 11.1. 


11.1.2 Implementation and Performance of the CLEAN 
Algorithm 


As a procedure for removing sidelobe responses, CLEAN is easy to understand. 
Being highly nonlinear, however, CLEAN does not yield readily to a complete 
mathematical analysis. Some conclusions have been derived by Schwarz (1978, 
1979), who has shown that conditions for convergence of CLEAN are that the 
synthesized beam must be symmetrical and its Fourier transform, that is, the 
Fig. 11.1 Illustration of the CLEAN procedure using observations of 3C224.1 at 2695 MHz made
with the interferometer at Green Bank, and rather sparse (u, v) coverage. (a) The synthesized
“dirty” image; (b) the image after one iteration with the loop gain $\gamma = 1$; (c) after two iterations;
(d) after six iterations. The components removed were restored with a clean beam in all cases.
The contour levels are 5, 10, 15, 20, 30, etc., percent of the maximum value. From J. A. Högbom
(1974), reproduced with permission. © ESO.


weighted transfer function, must be nonnegative. These conditions are fulfilled in 
the usual synthesis procedure. Schwarz’s analysis also indicates that if the number 
of delta-function components in the CLEAN model does not exceed the number 
of independent visibility data, CLEAN converges to a solution that is the least- 
mean-squares fit of the Fourier transforms of the delta-function components to the 
measured visibility. In enumerating the visibility data, either the real and imaginary 
parts or the conjugate values (but not both) are counted independently. In images
made using the fast Fourier transform (FFT) algorithm, there are equal numbers 
of grid points in the (u, v) and (l, m) planes, but not all (u, v) grid points contain 
visibility measurements. To maintain the condition for convergence, it is a common 
procedure to apply CLEAN only within a limited area, or “window,” of the original 
image. 

In order to clean an image of a given dimension, it is necessary to have a dirty 
beam pattern of twice the image dimensions so that a point source can be subtracted 
from any location in the image. However, it is often convenient for the image and 
beam to be the same size. In that case, only the central quarter of the image can 
be properly processed. Thus, it is commonly recommended that the image obtained 
from the initial Fourier transform should have twice the dimensions required for the 
final image. As mentioned above, the use of such a window also helps to ensure 
that the number of components removed does not exceed the number of visibility 
data and, in the absence of noise, allows the residuals within the window area to 
approach zero. 

Several arbitrary choices influence the result of the CLEAN process. These
include the parameter $\gamma$, the window area, and the criterion for termination. Note
that a point-source component in the image can be removed in one step of CLEAN
only if it is centered on an image cell. This is an important reason for choosing
$\gamma < 1$. A value between 0.1 and 0.5 is usually assigned to $\gamma$, and it is a matter
of general experience that CLEAN responds better to extended structure if the loop
gain is in the lower part of this range. The computation time for CLEAN increases
rapidly as $\gamma$ is decreased, because of the increasing number of subtraction cycles
required. If the signal-to-noise ratio is $R_{\rm sn}$, then the number of cycles required for
one point source is $-\log R_{\rm sn}/\log(1 - \gamma)$. Thus, for example, with $R_{\rm sn} = 100$ and
$\gamma = 0.2$, a point source requires 21 cycles.
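
The cycle count follows because each subtraction leaves a fraction $(1 - \gamma)$ of the peak, so after $n$ cycles the residual is $(1 - \gamma)^n$ of the original and must fall below $1/R_{\rm sn}$. A small sketch of the arithmetic, with a hypothetical function name and rounding up to a whole number of cycles:

```python
import math


def clean_cycles(snr, gain):
    """Cycles needed to bring one point source down to the noise level.

    Solves (1 - gain)**n <= 1/snr for the smallest integer n.
    """
    if not 0.0 < gain < 1.0:
        raise ValueError("loop gain must lie strictly between 0 and 1")
    return math.ceil(-math.log(snr) / math.log(1.0 - gain))


for gain in (0.5, 0.2, 0.1):
    print(gain, clean_cycles(100.0, gain))
```

Halving the loop gain from 0.2 to 0.1 roughly doubles the number of cycles, which illustrates the trade-off between computation time and the gentler treatment of extended structure.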

A well-known problem of CLEAN is the generation of spurious structure in the 
form of spots or ridges as modulation on broad features. A heuristic explanation 
of this effect is given by Clark (1982). The algorithm locates the maximum in the 
broad feature and removes a point-source component, as shown in Fig. 11.2. The 
negative sidelobes of the beam add new maxima, which are selected in subsequent 
cycles, and thus, there is a tendency for the component subtraction points to be 
located at intervals equal to the spacing of the first sidelobe of the synthesized (dirty) 
beam. The resulting image contains a lumpy artifact introduced by CLEAN, but the 
image is consistent with the measured visibility data. Cornwell (1983) introduced 
a modification of the CLEAN algorithm that is intended to reduce this unwanted 
modulation. The original CLEAN algorithm minimizes 


$$\sum_k w_k \left| V_k^{\rm meas} - V_k^{\rm model} \right|^2 , \tag{11.1}$$


where $V_k^{\rm meas}$ is the measured visibility at $(u_k, v_k)$, $w_k$ is the applied weighting, and
$V_k^{\rm model}$ is the corresponding visibility of the CLEAN-derived model. The summation
is taken over the points with nonzero data in the input transformation for the dirty
Fig. 11.2 Subtraction of the point-source response (broken line) at the maximum of a broad
feature, as in the process CLEAN. Adapted from Clark (1982). 


image. Cornwell's algorithm minimizes


$$\sum_k w_k \left| V_k^{\rm meas} - V_k^{\rm model} \right|^2 - \kappa s , \tag{11.2}$$


where $s$ is a measure of smoothness, and $\kappa$ is an adjustable parameter. Cornwell
found that the mean-squared intensity of the model, taken with a negative sign, is an 
effective implementation of s. 

The effects of visibility tapering appear in both the original image and the 
beam, and thus the magnitudes and positions of the components subtracted in the 
CLEAN process should be largely independent of the taper. However, since tapering 
reduces the resolution, it is a common practice to use uniform visibility weighting 
for images that are processed using CLEAN. Alternatively, in difficult cases such as
those involving extended smooth structure, reduction of sidelobes by tapering may 
improve the performance of CLEAN. 

An important reduction in the computation required for CLEAN was introduced 
by Clark (1980). This is based on subtraction of the point-source responses in 
the (u, v) plane and using the FFT for moving data between the (u, v) and (l, m) 
domains. The procedure consists of minor and major cycles. A series of minor 
cycles is used to locate the components to be removed by performing approximate 
subtractions using only a small patch of the synthesized dirty beam that includes 
the main beam and the major sidelobes. Then in a major cycle, the identified point- 
source responses are subtracted, without approximation, in the (u,v) plane. That 
is, the convolution of the delta functions with the dirty beam is performed by 
multiplying their Fourier transforms. The series of minor and major cycles is then 
repeated until the required stop condition is reached. Clark devised this technique 
for use with data from the VLA and found that it reduced the computation by a 
factor of two to ten compared with the original CLEAN algorithm. 
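
The key identity behind the major cycle, that subtracting a set of shifted dirty beams equals subtracting the inverse transform of the product of two Fourier transforms, can be checked with a short sketch. Several simplifications are assumed here: the example is one-dimensional, the convolution is periodic (wrap-around) as it is for gridded data, a naive discrete Fourier transform stands in for the FFT so that the example needs only the standard library, and all names and numbers are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        total = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for m in range(n))
        out.append(total / n if inverse else total)
    return out


def convolve_direct(model, beam0):
    """Shift-and-add: one dirty beam per component (beam0 has its peak at index 0)."""
    n = len(model)
    return [sum(model[p] * beam0[(i - p) % n] for p in range(n)) for i in range(n)]


def convolve_fourier(model, beam0):
    """Same convolution performed by multiplying the two transforms."""
    product = [a * b for a, b in zip(dft(model), dft(beam0))]
    return [c.real for c in dft(product, inverse=True)]


n = 32
beam0 = [sum(math.cos(2.0 * math.pi * u * d / n) for u in (1, 2, 3, 5, 8)) / 5.0
         for d in range(n)]
components = [(5, 0.20), (12, 0.10), (5, 0.16)]   # (pixel, amplitude) from minor cycles
model = [0.0] * n
for pixel, amplitude in components:
    model[pixel] += amplitude

dirty = convolve_direct([1.0 if i == 5 else 0.5 if i == 12 else 0.0 for i in range(n)], beam0)
residual_direct = [d - c for d, c in zip(dirty, convolve_direct(model, beam0))]
residual_major = [d - c for d, c in zip(dirty, convolve_fourier(model, beam0))]
print(max(abs(a - b) for a, b in zip(residual_direct, residual_major)))
```

With an FFT, the cost of the Fourier route is nearly independent of the number of components accumulated in the minor cycles, which is where the saving arises.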

Other variations on the CLEAN process have been devised; one of the more
widely used is the Cotton-Schwab algorithm [Schwab (1984); see Sect. IV], which
is a variation of the Clark algorithm. The subtractions in the major cycle are
performed on the ungridded visibility data, which eliminates aliasing at this point.
The algorithm is also designed to permit processing of adjacent fields: these are
treated separately in the minor cycles, but in the major cycles, components are jointly
removed from all fields.

To summarize the characteristics of CLEAN, we note that it is simple to 
understand from a qualitative viewpoint and straightforward to implement and that 
its usefulness is well proven. On the other hand, a full analysis of its response
is difficult. The response of CLEAN is not unique, and it can produce spurious 
artifacts. It is sometimes used in conjunction with model-fitting techniques; for 
example, a disk model can be removed from the image of a planet and the residual 
intensity processed by CLEAN. A more stable and efficient version of CLEAN 
called multiscale CLEAN has been developed for extended objects (Wakker and 
Schwarz 1988; Cornwell 2008). The basic idea is that broad emission components 
are identified first and removed. More sophisticated methods are being developed 
to handle extended emission [e.g., Junklewitz et al. (2016)]. CLEAN is also used as 
part of more complex image construction techniques. For more details, including 
hints on usage, see Cornwell et al. (1999), and for extended objects, Cornwell 
(2008). 


11.2 Maximum Entropy Method 


11.2.1 MEM Algorithm 


An important class of image-restoration algorithms operates to produce an image 
that agrees with the measured visibility to within the noise level, while constraining 
the result to maximize some measure of image quality. Of these, the maximum 
entropy method (MEM) has received particular attention in radio astronomy. If
$I'(l, m)$ is the intensity distribution derived by MEM, a function $F(I')$ is defined,
which is referred to as the entropy of the distribution. $F(I')$ is determined entirely
by the distribution of $I'$ as a function of solid angle and takes no account of structural
forms within the image. In constructing the image, $F(I')$ is maximized within the
constraint that the Fourier transform of $I'$ should fit the observed visibility values.
In astronomical image formation, an early application of MEM is that of
Frieden (1972) to optical images. In radio astronomy, the earliest discussions are
by Ponsonby (1973) and Ables (1974). The aim of the technique, as described by
Ables, is to obtain an intensity distribution consistent with all relevant data but
minimally committal with regard to missing data. Thus, $F(I')$ must be chosen so
that maximization introduces legitimate a priori information but allows the visibility
in the unmeasured areas to assume values that minimize the detail introduced.
Several forms of $F(I')$ have been used, which include the following:


$$F_1 = -\sum_i \frac{I'_i}{I'_s} \log\left(\frac{I'_i}{I'_s}\right) , \tag{11.3a}$$


$$F_2 = -\sum_i \log I'_i , \tag{11.3b}$$


$$F_3 = -\sum_i I'_i \ln\left(\frac{I'_i}{M_i}\right) , \tag{11.3c}$$

where $I'_i = I'(l_i, m_i)$, $I'_s = \sum_i I'_i$, $M_i$ represents an a priori model, and the sums are
taken over all pixels, $i$, in the image. $F_3$ can be described as relative entropy, since
the intensity values are specified relative to a model.
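
To give a feel for how these measures behave, the sketch below evaluates $F_1$ and $F_3$ for two small test images of equal total flux. It is illustrative only: the function names and pixel values are hypothetical, natural logarithms are used throughout (a different base only rescales $F_1$), and all pixels are assumed to be strictly positive.

```python
import math


def entropy_f1(image):
    """F1 = -sum (I_i / I_s) log(I_i / I_s), with I_s the sum over all pixels."""
    total = sum(image)
    return -sum((v / total) * math.log(v / total) for v in image)


def entropy_f3(image, prior):
    """F3 = -sum I_i ln(I_i / M_i), the entropy relative to the prior model M."""
    return -sum(v * math.log(v / m) for v, m in zip(image, prior))


flat = [1.0, 1.0, 1.0, 1.0]        # featureless image
peaked = [3.7, 0.1, 0.1, 0.1]      # same total flux, concentrated in one pixel

print(entropy_f1(flat), entropy_f1(peaked))
print(entropy_f3(flat, flat), entropy_f3(peaked, flat))
```

The flat image gives the larger $F_1$, and $F_3$ is largest (zero) when the image equals the prior, so maximizing either measure pulls the solution toward the featureless or a priori image unless the data require otherwise.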

A number of papers discuss the derivation of the expressions for entropy from 
theoretical and philosophical considerations. Bayesian statistics are invoked: see 
Jaynes (1968, 1982). Gull and Daniell (1979) consider the distributions of intensity 
quanta scattered randomly on the sky, and they derive the form F1, which is also used 
by Frieden (1972). The entropy form $F_3$ is obtained by Ables (1974) and Wernecke
and D'Addario (1977). Other investigators take a pragmatic approach to MEM
(Högbom 1979, Subrahmanya 1979, Nityananda and Narayan 1982). They view the
method as an effective algorithm, even though there may be no underlying physical
or information-theoretical basis for the choice of constraints. Högbom (1979) points
out that both $F_1$ and $F_2$ contain the required mathematical characteristics: the
first derivatives tend to infinity as $I'$ approaches zero, so maximizing $F_1$ or $F_2$
produces positivity in the image. The second derivatives are everywhere negative,
which favors uniformity in the intensity. Narayan and Nityananda (1984) consider a
general class of functions $F$ that have the properties $d^2F/dI'^2 < 0$ and $d^3F/dI'^3 > 0$.
$F_1$ and $F_2$, discussed above, are members of this class.

In the maximization of the entropy expression $F(I')$, the constraint that the
resulting intensity model should be consistent with the measured visibility data is
implemented through a $\chi^2$ statistic. Here, $\chi^2$ is a measure of the mean-squared
difference between the measured visibility values, $V_k^{\rm meas} = V(u_k, v_k)$, and the
corresponding values for the model $V_k^{\rm model}$:


$$\chi^2 = \sum_k \frac{\left| V_k^{\rm meas} - V_k^{\rm model} \right|^2}{\sigma_k^2} , \tag{11.4}$$


where $\sigma_k^2$ is the variance of the noise in $V_k^{\rm meas}$, and the summation is taken
over the visibility data set. Obtaining a solution involves an iterative procedure;
see Wernecke and D'Addario (1977), Wernecke (1977), Gull and Daniell (1978),
Skilling and Bryan (1984), and a review by Narayan and Nityananda (1984). As an
example, Cornwell and Evans (1985) maximize a parameter $J$ given by


$$J = F_3 - \alpha\chi^2 - \beta S_{\rm model} , \tag{11.5}$$


where $F_3$ is defined in Eq. (11.3c). $S_{\rm model}$ is the total flux density of the model and
is included because in order for the process to converge to a satisfactory result, it
was found necessary to include a constraint that the total flux density of the model
be equal to the measured flux density. Lagrange multipliers $\alpha$ and $\beta$ are included,
the values of which are adjusted as the model fitting proceeds so that $\chi^2$ and $S_{\rm model}$
are equal to the expected values. Through the use of $F_3$, a priori information can be
introduced into the final image. The various algorithms that have been developed
for implementing MEM generally use the gradients of the entropy and of $\chi^2$ to
determine the adjustment of the model in each iteration cycle.
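
The quantities entering Eqs. (11.4) and (11.5) can be evaluated for a toy problem as follows. This is a sketch under stated assumptions rather than any published implementation: the image is one-dimensional, the model visibilities are obtained from a direct Fourier sum at a few made-up spatial frequencies, $S_{\rm model}$ is taken as the sum of the pixel values, the multipliers $\alpha$ and $\beta$ are held fixed instead of being adjusted, and all names are hypothetical.

```python
import cmath
import math


def model_visibility(image, u_samples):
    """Direct Fourier sum of a 1-D image at the sampled spatial frequencies."""
    n = len(image)
    return [sum(v * cmath.exp(-2j * cmath.pi * u * i / n) for i, v in enumerate(image))
            for u in u_samples]


def chi_squared(v_meas, v_model, sigma):
    """Eq. (11.4): squared visibility differences weighted by the noise variance."""
    return sum(abs(a - b) ** 2 / s ** 2 for a, b, s in zip(v_meas, v_model, sigma))


def objective_j(image, prior, v_meas, u_samples, sigma, alpha, beta):
    """Eq. (11.5): J = F3 - alpha * chi^2 - beta * S_model (pixels must be > 0)."""
    f3 = -sum(v * math.log(v / m) for v, m in zip(image, prior))
    chi2 = chi_squared(v_meas, model_visibility(image, u_samples), sigma)
    return f3 - alpha * chi2 - beta * sum(image)


true_image = [1.0, 2.0, 4.0, 1.0]
flat_image = [2.0, 2.0, 2.0, 2.0]          # same total flux, no structure
u_samples = [1, 2]                          # zero spacing deliberately not sampled
sigma = [0.1, 0.1]
v_meas = model_visibility(true_image, u_samples)   # noise-free "measurements"

chi2_true = chi_squared(v_meas, model_visibility(true_image, u_samples), sigma)
chi2_flat = chi_squared(v_meas, model_visibility(flat_image, u_samples), sigma)
j_true = objective_j(true_image, true_image, v_meas, u_samples, sigma, 1.0, 0.1)
j_flat = objective_j(flat_image, true_image, v_meas, u_samples, sigma, 1.0, 0.1)
print(chi2_true, chi2_flat, j_true, j_flat)
```

Because the zero-spacing visibility is absent from `u_samples`, the data term alone says nothing about the total flux, which is one way to see why a separate flux constraint can be needed.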
A feature of images derived by MEM is that the point-source response varies
with position, so the angular resolution is not constant over the image. Comparison 
of maximum entropy images with those obtained using direct Fourier transformation 
often shows higher angular resolution in the former. The extrapolation of the 
visibility values can provide some increase in resolution over more conventional 
imaging techniques. 


11.2.2 Comparison of CLEAN and MEM 


CLEAN is defined in terms of a procedure, so the implementation is straightforward, 
but because of the nonlinearity in the processing, a noise analysis of the result is 
very difficult. In contrast, MEM is defined in terms of an image that fits the data to 
within the noise and is also constrained to maximize some parameter of the image. 
The noise in MEM is taken into account through the $\chi^2$ statistic, and the resulting
effect on the noise is more easily analyzed for MEM; see, for example, Bryan and 
Skilling (1980). Some further points of comparison are as follows:


• Implementation of MEM requires an initial source model, which is not necessary
  in CLEAN.

• CLEAN is usually faster than MEM for small images, but MEM is faster for
  very large images. Cornwell et al. (1999) give the break-even point as about $10^6$
  pixels.

• CLEAN images tend to show a small-scale roughness, attributable to the basic
  approach of CLEAN, which models all images as ensembles of point sources. In
  MEM, the constraint in the solution emphasizes smoothness in the image.

• Broad, smooth features are better deconvolved using MEM, since CLEAN may
  introduce stripes and other erroneous detail. MEM does not perform well on
  point sources, particularly if they are superimposed on a smooth background
  that prevents negative sidelobes from appearing as negative intensity in the dirty
  image.


To illustrate the characteristics of the CLEAN and MEM procedures, Fig. 11.3 
shows examples of processing of a model jet structure from Cornwell (1995) and 
Cornwell et al. (1999), using model calculations by Briggs. The jet model is based 
on similar structure in M87 and is virtually identical to the contour levels shown 
in part (e). The left end of the jet is a point source smoothed to the resolution 
of the simulated observation. Visibility values for the model corresponding to the 
(u, v) coverage of the VLBA (Napier et al. 1994) were calculated for a frequency of 
1.66 GHz and a declination of 50° with essentially full tracking range. Thermal 
noise was added, but the calibration was assumed to be fully accurate. Fourier 
transformation of the visibility data and the spatial transfer function provided the 
dirty image and dirty beam. The image shows the basic structure, but fine details are 
swamped by sidelobes. Parts (a) to (c) of Fig. 11.3 show the effects of processing by 
CLEAN. In the CLEAN deconvolution, 20,000 components were subtracted with a 
loop gain of 0.1. Part (a) shows the result of application of CLEAN to the whole
image, and part (b) shows the result when components are taken only within a tight 
support region surrounding the source (the technique sometimes referred to as use 
of a box or window). Note the improvement obtained in (b), which is a result of 
adding the information that there is no emission outside the box region. The contours 
approximately indicate the intensity increasing in powers of two from a low value 
of 0.05%. Part (c) shows the same image as panel (b) but with contours starting 
a factor of ten lower in intensity. The roughness visible in the low-level contours 
is characteristic of CLEAN, in which each component is treated independently 
and there is no mechanism to relate the result for any one component to those 
for its neighbors, unlike the case of MEM, in which a smoothness constraint is 
introduced. Parts (d) to (f) result from MEM processing. Part (d) shows the result 
of MEM deconvolution using the same constraint region as in panel (b) and 80 
iterations. The circular pattern of the background artifacts, centered on the point 
source, clearly shows that MEM does not handle such a feature well. In part (e), the 
point source was subtracted, using the CLEAN response to the feature, and then the 
MEM deconvolution performed with the same constraint region as in (d). The source 
was then replaced. Part (f) shows the same response as (e) with the lowest contours 
at the same level as panel (c). The low-level contours show the structure contributed 
by the observation and processing. The contours are smoother in the MEM image 
than in the CLEAN one. The images in (c) and (f) have comparable fidelity, that is, 
accuracy of reproduction of the initial model. Combinations of procedures, such as 
the use of CLEAN to remove point-source responses from an image and then the 
use of MEM to process the broader background features can sometimes be used to 
advantage in complex images. 


11.2.3 Further Deconvolution Procedures 


Briggs (1995) has applied a nonnegative, least-squares (NNLS) algorithm for 
deconvolution. The NNLS algorithm was developed by Lawson and Hanson (1974) 
and provides a solution to a matrix equation of the form AX = B, where, in the 
radio astronomy application, A represents the dirty beam and B the dirty image. 
The algorithm provides a least-mean-squares solution for the intensity X that is 
constrained to contain no negative values. However, unlike the case for MEM, there 
is no smoothness criterion involved. The NNLS solution requires more computer 
capacity than CLEAN or MEM solutions, but Briggs’s investigation indicated that 
it is capable of superior performance, particularly in cases of compact objects of 
width only a few synthesized beamwidths. NNLS was found to reduce the residuals 
to a level close to the system noise in the observations. In certain cases, it was 
found to work more effectively than CLEAN in hybrid imaging and self-calibration 
procedures (discussed in Sect. 11.3) and to allow higher dynamic range to be 
achieved. In MEM, the residuals may not be entirely random but may be correlated 
in the image plane, and this effect can introduce bias in the (u, v) data that limits the 
achievable dynamic range. CLEAN appears to behave somewhat similarly unless it 
is allowed to run long enough to work down into the noise. Some further discussion 
can be found in Briggs (1995) and Cornwell et al. (1999). 
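
As a rough illustration of the matrix formulation AX = B with a non-negativity
constraint, the following Python sketch deconvolves a one-dimensional toy image.
It does not implement the Lawson and Hanson active-set algorithm used by Briggs;
a simpler projected-gradient iteration is substituted here. The beam values, the
image size, and all function names are assumptions made for illustration only.

```python
def beam_matrix(beam, n):
    """Build the n x n matrix A with A[i][j] = beam[i - j] (zero where undefined)."""
    return [[beam.get(i - j, 0.0) for j in range(n)] for i in range(n)]


def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def nnls_projected_gradient(a, b, step=0.5, n_iter=2000):
    """Minimize |A x - b|^2 subject to x >= 0 by projected gradient descent.

    This is a stand-in for a true NNLS solver, not the Lawson-Hanson method.
    """
    at = [list(col) for col in zip(*a)]  # transpose of A
    x = [0.0] * len(b)
    for _ in range(n_iter):
        resid = [p - q for p, q in zip(matvec(a, x), b)]
        grad = matvec(at, resid)
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, xi - step * gi) for xi, gi in zip(x, grad)]
    return x


# Hypothetical dirty beam: unit peak with positive and negative sidelobes.
DIRTY_BEAM = {0: 1.0, 1: 0.3, -1: 0.3, 2: -0.1, -2: -0.1}
# Hypothetical sky: two point components on an otherwise empty field.
TRUE_SKY = [0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0]

A = beam_matrix(DIRTY_BEAM, len(TRUE_SKY))
dirty_image = matvec(A, TRUE_SKY)  # B = A X, noise-free
recovered = nnls_projected_gradient(A, dirty_image)
```

In this noise-free toy case, the constrained solution coincides with the input
sky. With real data, the beam matrix is far larger and less well conditioned,
which is one reason for the computational cost noted above.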


11.3 Adaptive Calibration and Imaging 


Calibration of the visibility amplitude is usually accurate to a few percent or better, 
but phase errors expressed as a fraction of a radian may be much larger, sometimes 
as a result of variations in the ionosphere or troposphere. Nevertheless, the relative 
values of the uncalibrated visibility measured simultaneously on a number of 
baselines contain information about the intensity distribution that can be extracted 
through the closure relationships described in Chap. 10, Eqs. (10.34) and (10.44). 
Following Schwab (1980), we use the term adaptive calibration for both the hybrid 
imaging and self-calibration techniques that make use of this information. Imaging 
with amplitude data only has also been investigated and is briefly described. 
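
The property that makes the closure relationships useful can be checked
numerically. The sketch below assumes the usual antenna-based error model, in
which the observed visibility on baseline (m, n) is the true visibility
multiplied by the gain of antenna m and the complex conjugate of the gain of
antenna n. The visibility values, phase errors, and function names are
hypothetical.

```python
import cmath
from itertools import combinations


def corrupt(true_vis, gains):
    """Apply antenna-based gains: V_mn(observed) = g_m * conj(g_n) * V_mn(true)."""
    return {(m, n): gains[m] * gains[n].conjugate() * v
            for (m, n), v in true_vis.items()}


def closure_phase(vis, m, n, p):
    """Phase of the triple product V_mn V_np V_pm, using V_pm = conj(V_mp)."""
    return cmath.phase(vis[(m, n)] * vis[(n, p)] * vis[(m, p)].conjugate())


# Hypothetical true visibilities for a four-antenna array (arbitrary values).
true_vis = {(0, 1): cmath.rect(1.0, 0.3), (0, 2): cmath.rect(0.8, -1.1),
            (0, 3): cmath.rect(0.5, 2.0), (1, 2): cmath.rect(0.9, 0.7),
            (1, 3): cmath.rect(0.6, -0.4), (2, 3): cmath.rect(0.7, 1.5)}
# Hypothetical antenna phase errors (radians), unit gain amplitude.
gains = [cmath.rect(1.0, ph) for ph in (0.0, 2.1, -1.3, 0.8)]

observed = corrupt(true_vis, gains)
triangles = list(combinations(range(4), 3))
true_closure = [closure_phase(true_vis, *t) for t in triangles]
obs_closure = [closure_phase(observed, *t) for t in triangles]
```

The individual baseline phases in `observed` differ from those in `true_vis` by
large amounts, but the closure phases around each triangle are unchanged,
because each antenna phase enters the sum once with each sign.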


11.3.1 Hybrid Imaging 


The rekindling of interest in closure techniques in the 1970s began with their 
rediscovery by Rogers et al. (1974), who used closure phases to derive model 
parameters for VLBI data. Fort and Yee (1976) and several later groups incorporated 
closure data into iterative imaging techniques, of which that by Readhead et al. 
(1980) is as follows: 


1. Obtain an initial trial image based on inspection of visibility amplitudes and any 
a priori data such as an image at a different wavelength or epoch. If the trial 
image is inaccurate, the convergence will be slow, but if necessary, an arbitrary 
trial image such as a single point source will often suffice. 

2. For each visibility integration period, determine a complete set of independent 
amplitude and/or phase closure equations. For each such set, compute a sufficient 
number of visibility values from the model such that when added to the closure 
relationships, the total number of independent equations is equal to the number 
of antenna spacings. 

3. Solve for the complex visibility corresponding to each antenna spacing and make 
an image from the visibility data by Fourier transformation. 

4. Process the image from step 3 using CLEAN but omitting the residuals. 

5. Apply constraints for positivity and confinement (delete components having 
negative intensity or lying outside the area that is judged to contain the source). 

6. Test for convergence and return to step 2 as necessary, using the image from 
step 5 as the new model. 


Note that the solution improves with iteration because of the constraints of 
confinement and positivity introduced in step 5. These nonlinear processes can be 
envisioned as spreading the errors in the model-derived visibility values throughout 
the visibility data, so that they are diluted when combined with the observed values 
in the next iterative cycle. 

In the process described, and most variants of it, the image is formed by using 
some data from the model and some from direct measurements, and following 
Baldwin and Warner (1978), the term hybrid imaging (or mapping) is sometimes 
used as a generic description. With the use of phase closure, there is no absolute 
position measurement, but there is no ambiguity with respect to the position angle 
of the image. With the use of amplitude closure, only relative levels of intensity are 
determined, but it is usually not difficult to calibrate enough of the data to establish 
an intensity scale. In many cases, the amplitude data are sufficiently accurate as 
observed, and only the phase closure relationships need be used; Readhead and 
Wilkinson (1978) have described a version of the above program using phase 
closure only. Other versions of this technique, which differ mainly in detail of 
implementation from that described, have been developed by Cotton (1979) and 
Rogers (1980). If there is some redundancy in the baselines, the number of free 
parameters is reduced, which can be advantageous, as discussed by Rogers. 

The number of antennas, $n_a$, is obviously an important factor in imaging 
procedures that make use of the closure relationships, since it affects the efficiency 
with which the data are used. We can quantify this efficiency by considering the 
number of closure data as a fraction of the number of data that would be available 
if full calibration were possible, as a function of $n_a$. The numbers of independent 
closure data are given by Eqs. (10.42) and (10.45). The number of data with full 
calibration is equal to the number of baselines, which, if we assume there is no 
redundancy, is $\frac{1}{2} n_a(n_a - 1)$. For the phase data, the fraction is 

$$\frac{\frac{1}{2}(n_a - 1)(n_a - 2)}{\frac{1}{2} n_a(n_a - 1)} = \frac{n_a - 2}{n_a} . \tag{11.6}$$

For the amplitude data, the fraction is 

$$\frac{\frac{1}{2} n_a(n_a - 3)}{\frac{1}{2} n_a(n_a - 1)} = \frac{n_a - 3}{n_a - 1} . \tag{11.7}$$


These fractions are also equal to the ratios of observed data to observed plus 
model-derived data in each iteration of the hybrid imaging procedure. Equations 
(11.6) and (11.7) are plotted in Fig. 11.4. For $n_a = 4$, the closure relationships 
yield only 50% of the possible phase data and only 33% of the amplitude data. For 
$n_a = 10$, however, the corresponding figures are 80% and 78%. Thus, in any array in 
which the atmosphere or instrumental effects may limit the accuracy of calibration 
by a reference source, it is desirable that the number of antennas should be at least 
ten and preferably more. The number of iterations required to obtain a solution 
with the hybrid technique depends on the complexity of the source, the number of 
antennas, the accuracy of the initial model, and other factors including details of the 
algorithm used. 
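
The fractions in Eqs. (11.6) and (11.7) are easily tabulated. The short Python
sketch below evaluates them from the counts of independent closure quantities
and of baselines; the function names are illustrative, and the third array size
in the loop is an arbitrary extra example.

```python
from fractions import Fraction


def n_baselines(n_a):
    """Number of baselines for n_a antennas with no redundancy."""
    return Fraction(n_a * (n_a - 1), 2)


def phase_closure_fraction(n_a):
    """Eq. (11.6): independent closure phases / baselines = (n_a - 2) / n_a."""
    return Fraction((n_a - 1) * (n_a - 2), 2) / n_baselines(n_a)


def amplitude_closure_fraction(n_a):
    """Eq. (11.7): independent closure amplitudes / baselines = (n_a - 3) / (n_a - 1)."""
    return Fraction(n_a * (n_a - 3), 2) / n_baselines(n_a)


for n_a in (4, 10, 27):
    print(n_a,
          round(float(phase_closure_fraction(n_a)), 3),
          round(float(amplitude_closure_fraction(n_a)), 3))
```

Both fractions approach unity as the number of antennas grows, which is the
quantitative basis for preferring ten or more antennas.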


11.3.2 Self-Calibration 


Hybrid imaging has largely been superseded by a more general approach called 
self-calibration. Here, the complex antenna gains are regarded as free parameters 
to be explicitly derived together with the intensity. In certain cases, the process is 
easily explained. For example, in imaging an extended source containing a compact 
component (as in many radio galaxies), the broad structure is resolved with the 
longer antenna spacings, leaving only the compact source. This can be used as a 
calibrator to provide the relative phases of the long-spacing antenna pairs, but not 
the absolute phase since the position is not known. Then, if there is a sufficient 
number of long spacings in the array, the relative gain factors of the antennas can 
be obtained using long spacings only. Such a special intensity distribution, however, 
is not essential to the method, and with an iterative technique, it is possible to use 
almost any source as its own calibrator. Programs of this type were developed by 
Schwab (1980) and by Cornwell and Wilkinson (1981). Reviews of the techniques 
are given by Pearson and Readhead (1984) and Cornwell (1989). 

The procedure in self-calibration is to use a least-mean-squares method to 
minimize the square of the modulus of the difference between the observed 
visibilities, $V_{mn}^{\rm meas}$, and the corresponding values for the derived model, 
$V_{mn}^{\rm model}$. The expression that is minimized is 

$$\sum_{\rm time} \sum_{m<n} w_{mn} \left| V_{mn}^{\rm meas} - g_m g_n^* V_{mn}^{\rm model} \right|^2 , \tag{11.8}$$

where the weighting coefficient $w_{mn}$ is usually chosen to be inversely proportional 
to the variance of $V_{mn}^{\rm meas}$, and the quantities shown are all functions of time within 
the observing period. Expression (11.8) can be written 

$$\sum_{\rm time} \sum_{m<n} w_{mn} \left| V_{mn}^{\rm model} \right|^2 \left| X_{mn} - g_m g_n^* \right|^2 , \tag{11.9}$$

where 

$$X_{mn} = \frac{V_{mn}^{\rm meas}}{V_{mn}^{\rm model}} . \tag{11.10}$$


If the model is accurate, the ratio $X_{mn}$ of the uncalibrated observed visibility to 
the visibility predicted by the model is independent of u and v but proportional 
to the antenna gains. Thus, the values of $X_{mn}$ simulate the response to a calibrator 
and enable the gains to be determined. However, since the initial model is only 
approximate, the desired result must be approached by iteration. 

The self-calibration procedure is: 


1. Make an initial image as for hybrid imaging. 

2. Compute the $X_{mn}$ factors for each visibility integration period within the 
observation. 

3. Determine the antenna gain factors for each integration period. 

4. Use the gains to calibrate the observed visibility values and make an image. 

5. Use CLEAN and select components to provide positivity and confinement of the 
image; Cornwell (1982) recommends omitting all features for which $|I(l, m)|$ is 
less than that for the most negative feature. 

6. Test for convergence and return to step 2 as necessary. 


The numbers of independent data used in the procedure above are, as in the case 
of hybrid imaging, equal to the numbers of independent closure relationships given 
in Eqs. (10.45) and (10.36), that is, $\frac{1}{2} n_a(n_a - 3)$ for amplitude and $\frac{1}{2}(n_a - 1)(n_a - 2)$ 
for phase. The two procedures, hybrid imaging and self-calibration, are basically 
equivalent but differ in details of approach and implementation. The efficiency as 
a function of the number of antennas (Fig. 11.4) applies to both. Examples of the 
performance of the self-calibration technique are shown in Figs. 11.5 and 11.6. 
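
Steps 2 to 4 of the self-calibration procedure can be sketched for a single
integration period as follows. The sketch assumes unit weights $w_{mn}$, a
point-source model of unit flux density, and noise-free data, and it minimizes
expression (11.9) by updating one antenna gain at a time with the others held
fixed. This is one simple way of solving the least-squares problem, not
necessarily the scheme used in any particular software package; the gain values
and function names are hypothetical.

```python
import cmath


def solve_gains(x, n_ant, n_sweeps=200):
    """Least-squares antenna gains from the ratios X_mn ~ g_m * conj(g_n).

    Minimizes sum_{m<n} |X_mn - g_m conj(g_n)|^2 (unit weights) by repeatedly
    solving for each gain in turn with the other gains held fixed.
    """
    def ratio(m, n):
        # Only m < n is stored; X_nm = conj(X_mn).
        return x[(m, n)] if m < n else x[(n, m)].conjugate()

    g = [1.0 + 0.0j] * n_ant
    for _ in range(n_sweeps):
        for m in range(n_ant):
            num = sum(ratio(m, n) * g[n] for n in range(n_ant) if n != m)
            den = sum(abs(g[n]) ** 2 for n in range(n_ant) if n != m)
            g[m] = num / den
    # The overall phase is arbitrary; refer all phases to antenna 0.
    ref = g[0] / abs(g[0])
    return [gm / ref for gm in g]


# Hypothetical true gains (amplitude, phase in radians); antenna 0 is the reference.
true_gains = [cmath.rect(a, p) for a, p in
              [(1.00, 0.0), (0.95, 0.4), (1.08, -0.5), (0.90, 0.25), (1.05, -0.2)]]
n_ant = len(true_gains)
baselines = [(m, n) for m in range(n_ant) for n in range(m + 1, n_ant)]

model_vis = {b: 1.0 + 0.0j for b in baselines}            # point-source model
meas_vis = {(m, n): true_gains[m] * true_gains[n].conjugate() * model_vis[(m, n)]
            for (m, n) in baselines}

x_ratio = {b: meas_vis[b] / model_vis[b] for b in baselines}   # Eq. (11.10), step 2
gains = solve_gains(x_ratio, n_ant)                            # step 3
calibrated = {(m, n): meas_vis[(m, n)] / (gains[m] * gains[n].conjugate())
              for (m, n) in baselines}                         # step 4
```

With an accurate model and no noise, the calibrated visibilities reproduce the
model. In practice the model is only approximate, so the gain solution is
alternated with imaging and deconvolution as in steps 5 and 6.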

Treating the gain factors, which are the fundamental unknown quantities, as 
free parameters as in self-calibration is a rather more direct approach than that 
of hybrid imaging. A global estimate of the instrumental factors is obtained using 
the entire data set. Cornwell (1982) points out that it is easier to deal correctly 
with the noise when considering complex visibility as a vector quantity, as in self- 
calibration, than when considering amplitude and phase separately, as in hybrid 
imaging. The noise combines additively in the vector components, resulting in a 
Gaussian distribution, whereas in the amplitude and phase, the more complicated 
Rice distributions of Eqs. (6.63) result. Cornwell and Wilkinson (1981) have developed 
a form of adaptive calibration that takes account of the different probability 
distributions of the amplitude and phase fluctuations, including system noise, for 
the different antennas. It has been used with the MERLIN array of Jodrell Bank, 
UK, which incorporates antennas of different sizes and designs (Thomasson 1986). 
The probability distributions of the antenna-associated errors are legitimate a priori 
information, which can be empirically determined for an array. 

Experience shows that adaptive calibration techniques in many cases converge to 
a satisfactory result using only a single point source as a starting model, although 
inaccuracy in the initial model increases the number of iterative cycles required. A 
point source is a good model for the phase of a symmetrical intensity distribution 
but may be a poor model for the amplitude. It must also be remembered that the 
accuracy of the closure relationships depends on the accuracy of the matching of 
the frequency responses and polarization parameters from one antenna to another, 
as discussed in Sects. 7.3 and 7.4. In general, any effect that cannot be represented 
by a single gain factor for each antenna degrades the closure accuracy. 

[Figure 11.5: contour images, panels (a) and (b); axes: Right Ascension (1950.0) and Declination (1950.0).] 

Fig. 11.5 Effect of self-calibration on a VLA radio image of the quasar 1548+115. (a) Image 
obtained by normal calibration techniques, which has spurious detail at the level of 1% of the peak 
intensity. (b) Image obtained by the self-calibration technique, in which the level of spurious detail 
is reduced below the 0.2% level. In both (a) and (b), the lowest contour level is 0.6%. © 1983 
IEEE. Reprinted, with permission, from Napier et al. (1983). 

Fig. 11.6 Three stages in the reduction of the observation of Cygnus A shown in Fig. 1.18. The top 
image is the result of transformation of the calibrated visibility data using the FFT algorithm. The 
calibration source was approximately 3° from Cygnus A. The center image shows reduction using 
the MEM algorithm. This compensates principally for the undersampling in the spatial frequencies 
and thereby removes sidelobes from the synthesized beam. The result is similar to that obtainable 
using the CLEAN algorithm. The bottom image shows the effect of the self-calibration technique, 
in which the maximum entropy image is used as the initial model. The final step improves the 
dynamic range by a factor of 3. In observations in which the initial calibration is not as good as 
in this case, self-calibration usually provides a greater improvement. The long dimension of the 
field is 2.1’ and contains approximately 1000 pixels. Courtesy R. A. Perley, J. W. Dreher, and J. J. 
Cowan. Reproduced by permission of and © NRAO/AUI. 


In using adaptive calibration techniques, the integration period of the data must 
not be longer than the coherence time of the phase variations; otherwise, the 
visibility amplitude may be reduced. The coherence time may be governed by the 
atmosphere, for which the timescale is of the order of minutes (see Sect. 13.4). In 
order for the imaging procedure to work, the field under observation must contain 
structure fine enough to provide a phase reference and bright enough to be detected 
with satisfactory signal-to-noise ratio within the coherence time. Thus, adaptive 
calibration does not solve all problems and cannot be used for the detection of a 
very weak source in an otherwise empty field. 


11.3.3 Imaging with Visibility Amplitude Data Only 


A number of early studies were made concerning the feasibility of producing images 
using only the amplitude values of the visibility. The Fourier transform of the 
squared modulus of the visibility is equal to the autocorrelation of the intensity 
distribution, $I \star\star I$: 

$$|V(u, v)|^2 = V(u, v)\, V^*(u, v) \longleftrightarrow I(l, m) \star\star I(l, m) . \tag{11.11}$$

The right side can also be written as a convolution: $I(l, m) \ast\ast I(-l, -m)$. The 
problem of imaging with $|V|$ only is mainly one of interpreting an image of the 
autocorrelation of $I$. Without phase data, the position of the center of the field cannot 
be determined, and there is a 180° rotational ambiguity in the position angle of the 
image. 

Examples of studies relevant to imaging without phase data are found in 
Bates (1969, 1984), Napier (1972), and Fienup (1978). Napier and Bates (1974) 
review some of the results. The positivity requirement is generally found to be 
insufficient to provide unique solutions for one-dimensional profiles, but for two- 
dimensional images, uniqueness is obtained in some cases (Bruck and Sodin 1979). 
Baldwin and Warner (1978, 1979) considered a two-dimensional distribution of 
sources, with some success in producing a source image from the autocorrelation 
function. Although these approaches showed some promise of providing useful 
interpretation of radio interferometer data, they have not been widely used. More 
importantly, the development of techniques that make use of closure relationships, 
such as hybrid imaging and self-calibration, has allowed visibility phases to 
contribute useful data even when not well calibrated. 
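
The relation in Eq. (11.11) and the 180° ambiguity can be illustrated with a
one-dimensional discrete analog. The sketch below uses a direct discrete
Fourier transform on a short, hypothetical intensity profile; the circular
(wrap-around) autocorrelation is a consequence of the discrete transform and is
an assumption of this toy example, as are the function names.

```python
import cmath


def dft(samples):
    """Direct discrete Fourier transform of a 1-D sequence."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * k * x / n)
                for x, s in enumerate(samples)) for k in range(n)]


def inverse_dft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(s * cmath.exp(2j * cmath.pi * k * x / n)
                for k, s in enumerate(spectrum)) / n for x in range(n)]


def circular_autocorrelation(samples):
    """Sum over x of I(x) I(x + lag), with wrap-around."""
    n = len(samples)
    return [sum(samples[x] * samples[(x + lag) % n] for x in range(n))
            for lag in range(n)]


# Hypothetical asymmetric profile: a strong and a weak component.
sky = [0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
# I(-l): the same profile reflected through the origin (the 180-degree rotation).
flipped = [sky[-x % len(sky)] for x in range(len(sky))]

amp_sky = [abs(v) for v in dft(sky)]
amp_flipped = [abs(v) for v in dft(flipped)]

# Transforming |V|^2 back yields the autocorrelation of I, not I itself.
from_amplitudes = [v.real for v in inverse_dft([a * a for a in amp_sky])]
autocorr = circular_autocorrelation(sky)
```

The two profiles `sky` and `flipped` are different, yet their visibility
amplitudes are identical, so amplitude data alone cannot distinguish them.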


11.4 Imaging with High Dynamic Range 


The dynamic range of an image is usually defined as the ratio of the maximum 
intensity to the rms level at some part of the field where the background is mainly 
blank sky. This rms level is assumed to indicate the lowest measurable intensity. The 
term image fidelity is used to indicate the degree to which an image is an accurate 
representation of a source on the sky. Image fidelity is not directly measurable on an 
actual source, but simulation of an observation of a model source and reduction of 
the visibility data allow comparison of the resulting image with the model. This is a 
way of investigating antenna configurations, processing methods, and other details. 
The requirements and techniques are discussed in detail by Perley (1989, 1999a). 

High dynamic range requires high accuracy in calibration, removal of any 
erroneous data, and careful deconvolution. That is, it requires high accuracy in the 
visibility measurements and very good (u, v) coverage. A phase error $\Delta\phi$ can be 
regarded as introducing an erroneous component of relative amplitude $\sin \Delta\phi$ into 
the visibility data, in phase quadrature to the true visibility. An amplitude error of 
$\varepsilon_a$% can be regarded as introducing an error component of relative amplitude $\varepsilon_a$% 
into the visibility. Thus, for example, a phase error of 10° introduces as large an error 
component as does an amplitude error of 17%. An amplitude error of 17% would be 
considered unusually large in most cases, except in conditions of strong atmospheric 
attenuation. However, a 10° phase error would be more commonly encountered, 
especially at frequencies in which ionospheric or tropospheric irregularities are 
important. A phase error $\Delta\phi$ (rad) in a correlator output introduces an error 
component of rms relative amplitude $\Delta\phi/\sqrt{2}$ in the resulting image. With similar 
errors in $n_a(n_a - 1)/2$ baselines, the dynamic range of a snapshot is limited to 
$\sim n_a/\Delta\phi$. 
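
The figures quoted above can be checked with a few lines of Python. The sketch
evaluates the error component $\sin \Delta\phi$ for a given phase error and the
approximate snapshot limit $n_a/\Delta\phi$; the latter follows if independent
errors of rms $\Delta\phi/\sqrt{2}$ on $n_a(n_a - 1)/2$ baselines are assumed to
average down as the square root of the number of baselines. The choice of 27
antennas is an assumed example, and the function names are illustrative.

```python
import math


def equivalent_amplitude_error(phase_error_deg):
    """Relative amplitude of the error component from a phase error: sin(dphi)."""
    return math.sin(math.radians(phase_error_deg))


def snapshot_dynamic_range_limit(n_antennas, phase_error_deg):
    """Approximate snapshot limit n_a / dphi, with dphi converted to radians."""
    return n_antennas / math.radians(phase_error_deg)


def averaged_rms_error(n_antennas, phase_error_deg):
    """RMS dphi/sqrt(2) per baseline, averaged over n_a (n_a - 1)/2 baselines."""
    n_base = n_antennas * (n_antennas - 1) / 2
    return math.radians(phase_error_deg) / math.sqrt(2.0) / math.sqrt(n_base)


percent = 100.0 * equivalent_amplitude_error(10.0)
limit = snapshot_dynamic_range_limit(27, 10.0)   # 27 antennas: assumed example
print(round(percent, 1), round(limit))
```

For a large number of antennas, the reciprocal of `averaged_rms_error` differs
from $n_a/\Delta\phi$ only by the factor $\sqrt{(n_a - 1)/n_a}$.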

Use of self-calibration is an essential step in minimizing gain errors. However, 
after calibration of the antenna-based gain factors, there remain small baseline- 
based terms that can also be calibrated. These result from variations, from one 
antenna to another, in the frequency passband or the polarization, and similar 
effects. Note that in arrays with very high sensitivity at the longer wavelengths, 
the requirement to observe down to the limit set by system noise, in the presence 
of background sources, places a lower limit on the required dynamic range. A 
large number of array elements is helpful in distinguishing individual sources 
(Lonsdale et al. 2000). Braun (2013) describes a detailed analysis of dynamic range 
in synthesis imaging and gives the results of application of this analysis to several 
large arrays. 

Obtaining the highest possible dynamic range requires attention to details that are 
specific to particular instruments. For the VLA, the following figures were quoted as 
rough guidelines for a good observation. Basic calibration results in dynamic range 
of order 1,000:1. After self-calibration, dynamic range up to ~20,000:1 is 
possible. After careful correction of baseline-based errors, it may be a few times 
higher. If a spectral correlator is used, which avoids errors in quadrature networks 
and also relaxes the requirement for delay accuracy, a dynamic range of ~200,000:1 
is possible, with much care, assuming that the signal-to-noise ratio is adequate 
(Perley 1989). 


11.5 Mosaicking 


Mosaicking is a technique that allows imaging of an area of sky that is larger 
than the beam of the array elements. It becomes very important in the millimeter- 
wavelength range, where antenna beams are relatively narrow. Although radio 
astronomy antennas for millimeter wavelengths are generally smaller in diameter 
than are antennas for centimeter wavelengths, their beamwidths are often narrower 
because the wavelengths are so much shorter. For example, the Atacama Large 
Millimeter/submillimeter Array (ALMA) can operate at frequencies up to 950 GHz, 
at which the beamwidth of the 12-m-diameter antennas could be as small as ~ 6”. 

Consider imaging a square field whose sides are n times the width of the antenna 
primary beam. One can divide the required area into $n^2$ subfields, each the size of a 
beam, and produce a separate image for each such area. The $n^2$ beam-area images 
can then be fitted together like mosaic pieces to cover the full field desired. One 
would anticipate that some difficulty might occur in obtaining uniform sensitivity, 
particularly near the joints of the mosaic pieces, but clearly the idea is feasible. From 
the sampling theorem described in Sect. 5.2, the number of visibility sample points 
in u and v required in an image covering $n^2$ beam areas is $n^2$ times as many as would 
be required in an image that covers just one beam area. In mosaicking, the increased 
data are obtained by using $n^2$ different pointing directions of the antennas. As a 
result, the sampling of the visibility in u and v must be at an interval 1/n of that for 
a field equal to the beam size, and this interval is usually less than the diameter of 
the antenna aperture. However, it is possible to determine how the visibility varies 
on a scale less than the diameter of an antenna, as discussed below. 

Figure 5.9 shows two antennas that are tracking the position of a source. The 
antenna spacing projected normal to the direction of the source is u, and the antenna 
diameter is $d_\lambda$, both quantities being measured in wavelengths. In the u direction, 
the interferometer responds to spatial frequencies from $(u - d_\lambda)$ to $(u + d_\lambda)$, since 
spacings within this range can be found within the antenna apertures. Measurement 
of the variation of the visibility over this range of baselines can provide the fine 
sampling required in mosaicking. The difference in path lengths from the source to 
the two antenna apertures is w wavelengths, and as the antennas track, the variation 
in w gives rise to fringes at the correlator output. Since the apertures of the antennas 
remain normal to the direction of the source, the path difference w, and its rate 
of change, are the same for any pair of points of which one is in each aperture 
plane, regardless of their spacing. Thus, because of the tracking motion, the signals 
received at any two such points produce a component of the correlator output with 
the same fringe frequency. Such components cannot, therefore, be separated by 
Fourier analysis, and information on the variation of the visibility within the spatial 
frequency range $(u - d_\lambda)$ to $(u + d_\lambda)$ is lost. However, in mosaicking, the antenna 
beams are scanned across the field, either by moving periodically between different 
pointing centers or by continuously scanning, for example, in a raster pattern. The 
scanning is in addition to the usual tracking motion to follow the source across the 
sky. In Fig. 5.9, it can be seen that if the antennas are suddenly turned through a 
small angle $\Delta\theta$, then the position of the point B is changed by $\Delta u\, \Delta\theta$ wavelengths 
in a direction parallel to that of the source. This results in a phase change of 
approximately $2\pi\, \Delta u\, \Delta\theta$ in the fringe component corresponding to the spacing 
$(u + \Delta u)$, of which points $A_1$ and B are an example. Since this phase change is 
linearly proportional to $\Delta u$, the variation of the visibility within the range $(u - d_\lambda)$ 
to $(u + d_\lambda)$ can be obtained by Fourier transformation of the correlator output with 
respect to the pointing offset $\Delta\theta$. Thus, the changes in pointing induce variations 
in the fringe phase that are dependent on the spacing of the incoming rays within 
the antenna apertures, and this effect allows the information on the variation of the 
visibility to be retained. 

The conclusion given above, that the scanning action of the antennas allows 
information on a range of visibility values to be retrieved, was first reached by 
Ekers and Rots (1979), using a mathematical analysis, as follows. Consider a 
pair of antennas with spacing $(u_0, v_0)$ pointing in the direction $(l_p, m_p)$. As the 
pointing angle is varied, the effective intensity distribution over the field of interest 
is represented by $I(l, m)$ convolved with the normalized antenna beam $A_N(l, m)$. The 
observed fringe visibility is the Fourier transform with respect to u and v of $I(l, m)$ 
multiplied by the antenna response for the particular pointing: 

$$V(u_0, v_0, l_p, m_p) = \iint A_N(l - l_p, m - m_p)\, I(l, m)\, e^{-j2\pi(u_0 l + v_0 m)}\, dl\, dm . \tag{11.12}$$

Assuming that the antenna beam is symmetrical, we can write Eq. (11.12) as 

$$V(u_0, v_0, l_p, m_p) = \iint A_N(l_p - l, m_p - m)\, I(l, m)\, e^{-j2\pi(u_0 l + v_0 m)}\, dl\, dm , \tag{11.13}$$

which has the form of a two-dimensional convolution: 

$$V(u_0, v_0, l_p, m_p) = \left[ I(l, m)\, e^{-j2\pi(u_0 l + v_0 m)} \right] \ast\ast\, A_N(l, m) . \tag{11.14}$$

Now we take the Fourier transform of V with respect to u and v, which represents 
the full-field visibility data obtained by means of the ensemble of pointing angles 
used: 

$$V(u, v) = \iint \left\{ \left[ I(l, m)\, e^{-j2\pi(u_0 l + v_0 m)} \right] \ast\ast\, A_N(l, m) \right\} e^{j2\pi(ul + vm)}\, dl\, dm = \left[ V(u, v) \ast\ast\, {}^2\delta(u_0 - u, v_0 - v) \right] \overline{A}_N(u, v) . \tag{11.15}$$

Here, $\overline{A}_N(u, v)$ is the Fourier transform of $A_N(l, m)$, that is, the autocorrelation of 
the field distribution over the aperture of a single antenna, referred to as the transfer 
function or spatial sensitivity function of the antenna. The two-dimensional delta 
function ${}^2\delta(u_0 - u, v_0 - v)$ is the Fourier transform of $e^{-j2\pi(u_0 l + v_0 m)}$. As the final 
step, Eq. (11.15) becomes 

$$V(u, v) = V[(u_0 - u), (v_0 - v)]\, \overline{A}_N(u, v) . \tag{11.16}$$


The conclusion from Eq. (11.16) is that if one observes a field of dimensions equal 
to several beamwidths, obtains the visibility for a number of pointing directions, and 
then for each antenna pair takes the Fourier transform of the visibility with respect 
to the pointing direction, the result will be values of the visibility extended over an 
area of the (u, v) plane as large as the support of the function $\overline{A}_N(u, v)$. For a circular 
reflector antenna of diameter d, $\overline{A}_N(u, v)$ is nonzero within a circle of diameter 2d. 
Thus, if $\overline{A}_N(u, v)$ is known with sufficient accuracy, that is, the beam pattern is 
sufficiently well calibrated, the visibility can be obtained at the intermediate points 
required to provide the full-field image. 

In the practical reduction of visibility data used in mosaicking, the Fourier transform 
with respect to pointing is usually not explicitly performed. The importance 
of the discussion above is that it shows that the information at the required spacing 
is present in the data if the antenna pointing is scanned with respect to the source, 
either as a continuous motion or as a series of discrete pointings. The reduction 
to obtain the intensity distribution is generally based on the use of nonlinear 
deconvolution algorithms. 

Cornwell (1988) has pointed out that the angular spacing required between the 
pointing centers on the sky can be deduced from the sampling theorem of Fourier 
transforms (Sect. 5.2.1). A more general form of the theorem can be stated as 
follows: If a function f(x) is nonzero only within an interval of width $\Delta$ in the 
x coordinate, then it is fully specified if its Fourier transform F(s) is sampled at 
intervals no greater than $\Delta^{-1}$ in s. If the sampling is coarser than this, aliasing will 
occur, and the original function will not be reproducible from the samples. Here, 
we consider an antenna beam pointing toward a source that is wide enough to cover 
most of the reception pattern, that is, the main beam and major sidelobes. As we 
move the antenna beam to different pointing angles to cover the source, we are 
effectively sampling the convolution of the source and the antenna beam. The beam 
pattern is equal to the Fourier transform of the autocorrelation function of the field 
distribution over the antenna aperture. The field cuts off at the edges of the aperture, 
which is $d_\lambda$ wavelengths wide. Thus, the autocorrelation function cuts off at a width 
$2d_\lambda$. The sampling theorem indicates that the interval between pointings, $\Delta l_p$, should 
not exceed $1/(2d_\lambda)$ in order to fully sample the source convolved with the beam. In 
practice, the antenna illumination function is likely to be tapered at the edge, so the 
autocorrelation function falls to low levels before it reaches the cutoff width $2d_\lambda$. 
Thus, if $\Delta l_p$ slightly exceeds $1/(2d_\lambda)$, the error introduced may not be large. 
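
As a numerical illustration of the pointing-interval criterion, the sketch below
evaluates $1/(2d_\lambda)$ for the ALMA example mentioned earlier (12-m antennas
at 950 GHz) and counts the pointings needed for a square field. The small-angle
approximation is used to convert the direction-cosine interval to arcseconds.
The 30-arcsecond field and the square pointing grid are assumptions chosen for
illustration, as are the function names.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0                  # m/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def nyquist_pointing_spacing_arcsec(frequency_hz, dish_diameter_m):
    """Largest pointing interval 1/(2 d_lambda), returned in arcseconds."""
    d_lambda = dish_diameter_m * frequency_hz / SPEED_OF_LIGHT
    return ARCSEC_PER_RADIAN / (2.0 * d_lambda)


def pointings_for_square_field(field_arcsec, frequency_hz, dish_diameter_m):
    """Number of pointings on a square grid at or below the Nyquist interval."""
    step = nyquist_pointing_spacing_arcsec(frequency_hz, dish_diameter_m)
    per_side = math.ceil(field_arcsec / step) + 1
    return per_side * per_side


spacing = nyquist_pointing_spacing_arcsec(950e9, 12.0)
n_pointings = pointings_for_square_field(30.0, 950e9, 12.0)   # hypothetical field
print(round(spacing, 2), n_pointings)
```

The spacing is roughly half of the ~6″ beamwidth quoted above, as expected for
sampling at half the reciprocal of the aperture width.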


11.5.1 Methods of Producing the Mosaic Image 


The basic steps in the mosaicking method are: 


1. Observe the visibility function for an appropriate series of pointing centers. 

2. Reduce the data for each pointing center independently to produce a series of 
images, each covering approximately one antenna beam area. 

3. Combine the beam-area images into the required full-field image. 


In step 2, it is desirable also to deconvolve the synthesized beam response from each 
beam-area image to remove the effects of sidelobes in the response, and this can be 
done using, for example, CLEAN or MEM. Use of these nonlinear algorithms can 
fill in some of the frequency components of the intensity that were omitted from the 
coverage of the antenna array. Cornwell (1988) and Cornwell et al. (1993) describe 
two procedures for mosaic imaging. The first of these, which they refer to as linear 
mosaicking, is essentially the three steps above with a least-mean-squares procedure 
for combination of the individual pointing images in step 3. Although a nonlinear 
deconvolution is used individually on each beam-area image, the combination of 
the images is a linear process. The second procedure, which differs in that the 
deconvolution is performed jointly, is referred to as nonlinear mosaicking and 
involves a nonlinear algorithm such as MEM. Unmeasured visibility data can 
best be estimated in the deconvolution process if the full field that is covered by 
the ensemble of pointing angles contributes simultaneously to the deconvolution, 
rather than by treating each primary beam area separately. The benefit of a joint 
deconvolution of the combined beam-area images is illustrated by consideration of 
an unresolved component of the intensity distribution located at the edge of a beam 
area, where it occurs in two or more individual beam images. Being at the beam 
edge where the response is changing rapidly, the amplitude of the component is 
more likely to be inaccurately determined, but such errors will tend to average out 
in the combined data. In the application to mosaicking, maximum entropy can be 
envisaged as the formation of an image that is consistent with all the visibility data 
for the various pointings, within the uncertainty resulting from the noise. 
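
The linear combination in step 3 can be sketched in one dimension. The text
does not give the combination formula; the sketch below assumes a commonly used
weighted least-squares form, in which each deconvolved pointing image (still
attenuated by the primary beam) is multiplied by its beam, divided by its noise
variance, and summed, and the sum is normalized by the accumulated squared beam
weights. The Gaussian beam shape, the pointing centers, and the function names
are hypothetical.

```python
import math


def gaussian_beam(n_pix, center, fwhm):
    """Primary-beam power pattern on a 1-D pixel grid (assumed Gaussian)."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((x - center) / sigma) ** 2) for x in range(n_pix)]


def linear_mosaic(pointing_images, beams, noise_variances):
    """Weighted least-squares combination of beam-attenuated pointing images."""
    n_pix = len(pointing_images[0])
    mosaic = []
    for x in range(n_pix):
        num = sum(b[x] * img[x] / var
                  for img, b, var in zip(pointing_images, beams, noise_variances))
        den = sum(b[x] ** 2 / var for b, var in zip(beams, noise_variances))
        mosaic.append(num / den)
    return mosaic


N_PIX = 40
sky = [0.0] * N_PIX
sky[8], sky[20], sky[31] = 1.0, 2.5, 0.7          # hypothetical point components

beams = [gaussian_beam(N_PIX, c, 12.0) for c in (6, 13, 20, 27, 34)]
# Noise-free pointing images, assumed already deconvolved: sky times primary beam.
images = [[b[x] * sky[x] for x in range(N_PIX)] for b in beams]
mosaic = linear_mosaic(images, beams, [1.0] * len(beams))
```

In this noise-free case, the combination recovers the input sky exactly. With
noise, pixels near the edge of a beam receive low weight from that pointing,
which is how the errors at beam edges tend to average out.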

Cornwell (1988) discusses use of the MEM algorithm of Cornwell and Evans 
(1985) in mosaicking. This algorithm is briefly described in Sect. 11.2.1 [see 
Eq. (11.5)]. The procedure is essentially the same as in the application to a 
single-pointing image, except for a few more steps in determining $\chi^2$ and its gradient. 
As in Eq. (11.4), $\chi^2$ is the statistic that indicates the deviation of the model from the 
measured visibility values and is here expressed as 

$$\chi^2 = \sum_p \sum_k \frac{\left| V_{kp}^{\rm meas} - V_{kp}^{\rm model} \right|^2}{\sigma_{kp}^2} , \tag{11.17}$$

where the subscripts k and p indicate the kth visibility value at the pth pointing 
position, and $\sigma_{kp}^2$ is the variance of the visibility. An initial model is required, and 
the procedure follows a series of steps described by Cornwell (1988): 


1. For the first pointing center, multiply the current trial model with the antenna 
beam as pointed during the observation, and take the Fourier transform with 
respect to (l, m) to obtain the predicted visibility values. 

2. Subtract the measured visibilities from the model visibilities to obtain a set
of residual visibilities. Insert the residual visibilities into the accumulating $\chi^2$
function of Eq. (11.17).

3. By Fourier transformation, convert the residual visibilities, weighted inversely as 
their variances, into an intensity distribution. Taper this distribution by multiply- 
ing it by the antenna beam pattern, and store in a data array of dimensions equal 
to the full MEM model. 

4. Repeat steps 1-3 for each pointing. In step 2, add the value for $\chi^2$ to those for
the other pointings in this cycle. In step 3, add the residual intensity values into
the data array. The accumulated values in this data array are used to obtain the
gradient of $\chi^2$ with respect to the MEM image.
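
The accumulation in steps 1-4 can be sketched in a few lines of Python. This is only a one-dimensional toy under stated assumptions, not the implementation of Cornwell (1988): the Gaussian primary beam, the direct (slow) Fourier sums, the function names, and the dictionary layout used for each pointing are all illustrative choices. It shows how each pointing adds its own term to $\chi^2$ and how the variance-weighted, beam-tapered residual image accumulates into the gradient.

```python
import cmath
import math


def gaussian_beam(n_pix, center, fwhm):
    """Toy primary beam A_p(x): a Gaussian sampled on n_pix pixels (assumed shape)."""
    s = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((x - center) / s) ** 2) for x in range(n_pix)]


def predict(model, beam, us):
    """Step 1: taper the trial model by the beam and transform to visibilities."""
    n = len(model)
    return [sum(beam[x] * model[x] * cmath.exp(-2j * math.pi * u * x / n)
                for x in range(n))
            for u in us]


def chi2_and_gradient(model, pointings):
    """Steps 1-4: accumulate chi^2 and its gradient over all pointings.

    pointings is a list of dicts with keys 'beam', 'us', 'vis', 'var'
    (a hypothetical layout chosen for this sketch).
    """
    n = len(model)
    chi2 = 0.0
    grad = [0.0] * n
    for p in pointings:
        # Step 2: residual = model visibility minus measured visibility.
        pred = predict(model, p["beam"], p["us"])
        resid = [vm - vo for vm, vo in zip(pred, p["vis"])]
        chi2 += sum(abs(r) ** 2 / p["var"] for r in resid)
        # Step 3: transform the variance-weighted residuals back to the
        # image plane, taper by the beam, and add into the gradient array.
        for x in range(n):
            back = sum((r / p["var"]) * cmath.exp(2j * math.pi * u * x / n)
                       for r, u in zip(resid, p["us"]))
            grad[x] += 2.0 * p["beam"][x] * back.real
    return chi2, grad
```

With noise-free data simulated from a known model, `chi2_and_gradient` returns zero for both quantities at that model, and a finite-difference check at any other trial model reproduces the accumulated gradient.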


The reason for the additional multiplication of the residual distribution by the beam 
function in step 3 is that it reduces unwanted responses from sidelobes of the 
primary beam that fall on adjacent pointing areas. It also weights the data with 
respect to the signal-to-noise ratio. Completion of the MEM procedure may require 
several tens of cycles through the steps given above to obtain convergence to the 
final image. To complete the process, smoothing with a two-dimensional Gaussian 
beam of width equal to the array resolution is recommended, to reduce the effects 
of variable resolution across the image. 

A slightly different procedure for nonlinear mosaicking is described by Sault 
et al. (1996). In this case, the beam-area images are combined linearly without the 
individual deconvolution step, and then the final nonlinear deconvolution is applied 
to the combined image. In the linear combination, each pixel in the combined 
image is a weighted sum of the corresponding pixels in the individual beam-area 
images. As an example, Sault et al. also show results for a mosaic of the Small 
Magellanic Cloud made with the compact configuration of the Australia Telescope 
using 320 pointings. They demonstrate that the joint deconvolution used in nonlinear 
mosaicking is superior to the linear combination of the subfield images, even if these 
have been individually deconvolved. They also show the deconvolution using both 
their method and that described by Cornwell (1988) and conclude that the results 
are of comparable quality. 
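
The weighted sum used in the linear combination can be illustrated as follows. The weighting shown, in which each beam-attenuated image is multiplied by its primary beam and divided by its noise variance, then normalized by the summed squared beams, is the usual least-squares form and is assumed here rather than taken from this section; the one-dimensional images, the Gaussian beam, and the function names are likewise illustrative.

```python
import math


def gaussian_beam(n_pix, center, fwhm):
    """Toy primary beam A_p(x): a Gaussian sampled on n_pix pixels (assumed shape)."""
    s = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return [math.exp(-0.5 * ((x - center) / s) ** 2) for x in range(n_pix)]


def linear_mosaic(images, beams, variances):
    """Combine beam-attenuated images I_p(x) = A_p(x) I(x) + noise.

    Each output pixel is sum_p(A_p I_p / var_p) / sum_p(A_p^2 / var_p).
    """
    n = len(images[0])
    out = []
    for x in range(n):
        num = sum(b[x] * im[x] / v for im, b, v in zip(images, beams, variances))
        den = sum(b[x] ** 2 / v for b, v in zip(beams, variances))
        # Pixels that no beam covers carry no information; return zero there.
        out.append(num / den if den > 0.0 else 0.0)
    return out
```

For noise-free inputs the combination returns the underlying intensity exactly, whatever the relative variances, because the beam weighting in the numerator and denominator cancels pixel by pixel.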


11.5.2 Short-Baseline Measurements 


In imaging sources wider than the antenna beam, it is important to obtain visibility 
values at increments in u and v that are smaller than the diameter of an antenna. Data 
equivalent to an essentially continuous coverage in u and v can then be obtained by 
observing at various pointing positions as discussed above. The minimum spacing of 
two antennas is limited by mechanical considerations, and there is a gap or region 
of low sensitivity corresponding to a spacing of about half the minimum spacing 
between the centers of two antenna apertures. This is called the “short-spacing 
problem.” 

This minimum spacing depends on the antenna design, but in general, unless the 
range of zenith angles is restricted, two antennas of diameter d cannot be spaced 
much closer than about 1.4d, or perhaps 1.25d with special design. Otherwise, there 
is danger of mechanical collision, especially if there is a possibility that the antennas 
may not always be pointing in the same direction. Total-power observations with a 
single antenna will, in principle, provide spacings from zero to $d/\lambda$, but with some
antennas, measurements at spatial frequencies greater than $\sim 0.5d/\lambda$ are unreliable
because the spatial sensitivity function of the antenna falls to low levels as a result
of the tapered illumination of the reflector. Missing data at low (u, v) values result in 
broad negative sidelobes of the synthesized beam, such that the beam appears to be 
situated in a shallow bowl-shaped depression. This effect is most noticeable when 
the field to be imaged is wide enough that there are several empty (u, v) cells within 
the central area. 

The transfer function $\overline{A}_N(u)$ is the autocorrelation function of the field distribution
over the antenna aperture and depends on the particular design of the antenna,
including the illumination pattern of the feed. The solid curve in Fig. 11.7 shows
$\overline{A}_N$ for a uniformly illuminated circular aperture, which can be regarded as an
ideal case. Since there is usually some tapering in the illumination of a reflector
antenna, in practice, $\overline{A}_N$ will generally fall off somewhat more rapidly than the
curves shown. The function $\overline{A}_N$ in Fig. 11.7 is proportional to the common area
of two overlapping circles of diameter d, and the abscissa is the distance between 
their centers. In three dimensions, this function is sometimes referred to as the 
Chinese hat function, and its properties are discussed by Bracewell (1995). The 
dashed curves in Fig. 11.7 show the relative spatial sensitivity for an interferometer 
using two uniformly illuminated circular apertures of diameter d. Curve 1 is for a
spacing of 1.4d between the centers of the apertures; curve 2 is for a spacing of 
1.25d. If both total-power and interferometer data are obtained, it can be seen that 
the minimum sensitivity occurs for spacings of approximately half of the antenna 
spacing. 
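
The overlap-area function for a uniformly illuminated circular aperture has a simple closed form, and the following sketch uses it to locate the weakest point in the combined coverage. It assumes, for illustration only, that the total-power and interferometer responses are normalized to the same peak value; the function names are hypothetical, and spacings are expressed in units of the antenna diameter d.

```python
import math


def chinese_hat(q):
    """Normalized overlap area of two circles of unit diameter whose
    centers are a distance q apart (zero for q >= 1)."""
    q = abs(q)
    if q >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(q) - q * math.sqrt(1.0 - q * q))


def weakest_spacing(separation, steps=1000):
    """Scan spacings 0..separation and return (q, sensitivity) at the point
    where the better of the total-power and interferometer responses is lowest."""
    best = None
    for i in range(steps + 1):
        q = separation * i / steps
        s = max(chinese_hat(q), chinese_hat(q - separation))
        if best is None or s < best[1]:
            best = (q, s)
    return best
```

Calling `weakest_spacing(1.4)` and `weakest_spacing(1.25)` places the minimum at half the antenna separation in both cases, with the closer spacing giving the higher minimum.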

One solution to increasing the minimum sensitivity in the spatial frequency 
coverage is the addition of total-power measurements from a larger antenna [see, for 
example, Bajaja and van Albada (1979), Welch and Thornton (1985), Stanimirović
(2002)]. Stanimirović considered the requirements for the use of single-antenna
measurements of fringe visibility and concluded that the diameter of the antenna
should be at least 1.5 times the spacing for which the visibility value is required.
Note, however, that since the cost of an antenna scales approximately as $d^{2.7}$ (see
Sect. 5.7.2.2), the expected cost of an antenna of diameter 1.5d is roughly 4.4
times that of an antenna of diameter d. The process of merging total power and
interferometric data is sometimes called “feathering.”


Fig. 11.7 The solid curve centered on the origin shows the spatial sensitivity function $\overline{A}_N$ for
a single antenna of diameter d. The curve corresponds to the case of uniform excitation over
the aperture. This curve indicates the relative sensitivity to spatial frequencies for total-power
observations with a single antenna. The dashed curves show the spatial sensitivity for two antennas
of diameter d, with uniform aperture excitation, working as an interferometer. Curve 1 is for a
spacing of 1.4d between the centers of the antennas, and curve 2 is for a spacing of 1.25d. If the
aperture illumination is tapered, the curves will fall off to low values more rapidly than is shown.

Another possibility for covering the missing spatial frequencies is the use of one 
or more pairs of smaller antennas, say, d/2 in diameter, with spacing about 0.7d. A 
pair of antennas of diameter d/2 have one-quarter of the area, and consequently one- 
quarter of the sensitivity to fine structure, of a pair of the standard antennas. Since 
the beam of the smaller antenna has four times the solid angle of a standard antenna, 
it will require one-quarter of the number of pointing directions, and the integration 
time for each one can be four times as long. Cornwell et al. (1993) present 
evidence that, for mosaicking, it is possible to obtain satisfactory performance 
with a homogeneous array, that is, one in which all antennas are the same size. 
This requires total-power observation as well as interferometry with some antennas 
spaced as closely as possible. The deconvolution steps in the data reduction help to 
fill in remaining (u, v) gaps. 

At frequencies of several hundred gigahertz, where antenna beams are of minute-of-arc
order, images of objects of order one degree in size require numbers of
pointings in the range $10^3$–$10^4$. Any given pointing cannot be quickly repeated,
so dependence on Earth rotation to fill in small gaps in the (u, v) coverage may not 
be practicable. Thus, arrays designed for mosaicking of large objects require good 
instantaneous (u, v) coverage. At such high frequencies, it is also desirable to avoid 
high zenith angles to minimize atmospheric effects. 
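
A rough count of the pointings needed for a large object follows from dividing the field into beam-sized steps. The sketch below assumes a square grid and treats the step size, expressed as a fraction of the beamwidth, as a free parameter; the function and parameter names are illustrative.

```python
import math


def n_pointings(field_deg, beam_arcmin, step_in_beams=1.0):
    """Number of pointings on a square grid covering field_deg on a side.

    step_in_beams is the grid step as a fraction of the beamwidth
    (0.5 means half-beam spacing).
    """
    per_side = math.ceil(field_deg * 60.0 / (beam_arcmin * step_in_beams))
    return per_side * per_side
```

For a one-degree field and a one-arcminute beam, full-beam steps give 3600 pointings and half-beam steps give 14,400.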

An alternative to tracking discrete pointing centers is to sweep the beams over 
the area of sky under investigation in a raster scan motion. This technique has been 
referred to as “on-the-fly” mosaicking. It has several advantages: 


• The uniformity of the (u, v) coverage for all points in the field is maximized,
which results in uniformity of the synthesized beam across the resulting image
and thereby simplifies the image processing.

• Each point in the field is observed many times in as rapid succession as possible,
so some advantage can be taken of Earth rotation to fill in the (u, v) coverage.

• If total-power measurements are made, the scanning motion of the beam can be
used to remove atmospheric effects in a similar way to the use of beam switching
in large single-dish telescopes.

• Waste of observing time during moves of the antennas from one pointing center
to another is eliminated.


With on-the-fly observing, the real-time integration at the correlator output must be 
somewhat less than the time taken for the beam to scan over any point in the field, 
and thus a large number of visibility data, each with a separate pointing position, 
are generated. 

11.6 Multifrequency Synthesis


Making observations at several different radio frequencies is an effective way of 
improving the sampling of the visibility in the (u, v) plane. This technique is referred 
to as multifrequency synthesis, or bandwidth synthesis. Generally, the range of 
frequencies is about ±15% of the midrange value. Such a range can be very effective
in filling in gaps in the coverage, and since it is not too large, major changes in 
the source structure with frequency are avoided [see, e.g., Conway et al. (1990)]. 
However, the variation of structure with frequency may be large enough to limit the 
dynamic range unless some steps are taken to mitigate it, as discussed here. The 
principal continuum radio emission mechanisms produce radio spectra that vary 
smoothly in frequency (see Fig. 1.1), and the intensity usually follows a power-law 
variation with frequency: 


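$$I(\nu) = I(\nu_0) \left( \frac{\nu}{\nu_0} \right)^{\alpha} , \tag{11.18}$$


where $\alpha$ is the spectral index, which varies with (l, m). If the spectrum does not
conform to a power law, then, in effect, we can write


$$\alpha = \frac{\nu}{I} \frac{\partial I}{\partial \nu} . \tag{11.19}$$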


If the spectral index were a constant over the source, the spectral effects could be 
removed. Although this is generally not the case, the spectral effects are reduced 
by first correcting the data for a “mean” or “representative” spectral index for the 
overall structure to be imaged. Thus, from this point, $\alpha$ will represent the spectral
index of the deviation of the intensity distribution from this first-order correction. 
Consider the case in which the intensity variation can be approximated by a linear 
term: 


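$$\begin{aligned}
I(\nu) &= I(\nu_0) + \frac{\partial I}{\partial \nu} (\nu - \nu_0) \\
&= I(\nu_0) + \alpha I(\nu) \frac{(\nu - \nu_0)}{\nu} \\
&\approx I(\nu_0) + \alpha I(\nu_0) \frac{(\nu - \nu_0)}{\nu_0} ,
\end{aligned} \tag{11.20}$$


where the reference frequency $\nu_0$ is near the center of the range of frequencies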
used. Equation (11.20) is the sum of a single-frequency term and a spectral term. 
To determine the synthesized beam of an array working in the multifrequency 
mode, consider the response to a point source with a spectrum given by Eq. (11.20). 
The response to the single-frequency term can be obtained by taking the Fourier 
transform of the spatial transfer function. The transfer function has a delta function 
of u and v for each visibility measurement. Each frequency used contributes a
different set of delta functions. The response to the spectral term is obtained by
multiplying the transfer function by $(\nu - \nu_0)/\nu_0$ and taking the Fourier transform.
If we call the single-frequency and spectral responses $b_0'$ and $b_1'$, respectively, the
synthesized beam is equal to


$$b_0(l, m) = b_0'(l, m) + \alpha(l, m)\, b_1'(l, m) . \tag{11.21}$$


The first component is a conventional synthesized beam, and the second one is 
an unwanted artifact. The measured intensity distribution obtained as the Fourier 
transform of the measured visibilities is 


$$I_0(l, m) = I(l, m) \ast\ast\, b_0'(l, m) + \alpha(l, m) I(l, m) \ast\ast\, b_1'(l, m) , \tag{11.22}$$


where I(l, m) is the true intensity on the sky. Conway et al. (1990) and Sault 
and Wieringa (1994) have both developed deconvolution processes based on the 
CLEAN algorithm that deconvolve both $b_0'$ and $b_1'$. In the method used by the
first of these groups, components representing each one of the two beams were 
removed alternately. In the method used by the second group, each component 
removed represented both beams. These methods provide the distribution of both 
the source intensity and the spectral index as functions of frequency. Conway et al. 
also consider a logarithmic rather than a linear form of the frequency offsets from 
$\nu_0$. These analyses show that for a frequency spread of approximately ±15%, the
magnitude of the response resulting from the $b_1'$ component is typically 1% and
can sometimes be ignored. Removing the $b_1'$ component reduces the spectral effects
to $\sim 0.1\%$.
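
The two beam components can be illustrated with a one-dimensional toy array. In this sketch, each baseline of length D observed at frequency $\nu$ contributes a sample at $u = D\nu/c$; the single-frequency response is the average of the cosine fringes, and the spectral response is the same average weighted by $(\nu - \nu_0)/\nu_0$. The equal weighting of all samples, the restriction to one dimension, and the function name are assumptions made for illustration.

```python
import math

C_LIGHT = 299792458.0  # speed of light, m/s


def mfs_beams(baselines_m, freqs_hz, nu0_hz, l_values):
    """Return (b0, b1): single-frequency and spectral beam responses
    evaluated at the direction cosines in l_values."""
    n_samples = len(baselines_m) * len(freqs_hz)
    b0, b1 = [], []
    for l in l_values:
        s0 = 0.0
        s1 = 0.0
        for d in baselines_m:
            for nu in freqs_hz:
                u = d * nu / C_LIGHT                  # spacing in wavelengths
                fringe = math.cos(2.0 * math.pi * u * l)
                s0 += fringe
                s1 += (nu - nu0_hz) / nu0_hz * fringe  # spectral weighting
        b0.append(s0 / n_samples)
        b1.append(s1 / n_samples)
    return b0, b1
```

When the frequencies are placed symmetrically about the reference frequency, the spectral response vanishes at the beam center, so the artifact appears only in the sidelobes; its magnitude anywhere is bounded by the mean fractional frequency offset.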

For other approaches and extensions to multifrequency synthesis, see Rau and 
Cornwell (2011) and Junklewitz et al. (2015). 


11.7 Noncoplanar Baselines 


In Sect. 3.1, it was shown that, except in the case of east-west linear arrays, the 
baselines of a synthesis array do not remain in a plane as the Earth rotates. It was 
also shown that for fields of view of small angular size [as given approximately by 
Eq. (3.12)], the Fourier transform relationship between visibility and intensity can 
be expressed satisfactorily in two dimensions. However, particularly for frequencies 
less than a few hundred megahertz, the small-field assumption does not always 
apply. At meter wavelengths, the primary beams of the antennas are wide, for 
example, ~ 6° for a 25-m-diameter antenna at a wavelength of 2 m. Also, the high 
density of strong sources on the sky at meter wavelengths requires that the full beam 
be imaged to avoid confusion. We now consider the case in which the condition in 
Eq. (3.12) ($\theta_f < \frac{1}{3}\sqrt{\theta_b}$) is not valid, so the two-dimensional solution should not
be used. The following treatment follows those of Sramek and Schwab (1989) and 
others as noted. We start with the exact result in Eq. (3.7), which is


$$V(u,v,w) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{A_N(l,m) I(l,m)}{\sqrt{1 - l^2 - m^2}} \exp\left\{ -j2\pi \left[ ul + vm + w \left( \sqrt{1 - l^2 - m^2} - 1 \right) \right] \right\} dl\, dm . \tag{11.23}$$


Here, V(u, v, w) is the visibility as a function of spatial frequency in three
dimensions, $A_N(l, m)$ is the normalized primary beam pattern of an antenna, and
I(l, m) is the two-dimensional intensity distribution to be imaged.

The next step is to rewrite Eq. (11.23) in the form of a three-dimensional Fourier 
transform, which involves the third direction cosine n defined with respect to the 
w axis. The phase of the visibility V(u, v, w) is measured relative to the visibility 
of a (hypothetical) source at the phase reference position for the observation. This 
introduces a factor $e^{j2\pi w}$ within the exponential term on the right side of Eq. (11.23),
as noted in the text following Eq. (3.7). The corresponding phase shift is inserted
by the fringe rotation discussed in Sect. 6.1.6. As a result of this factor, we use
$n' = n - 1$ as the conjugate variable of w in order to obtain the three-dimensional
Fourier transform. Functions of $n'$ will be indicated by a prime. Thus, we rewrite
Eq. (11.23):


$$V(u,v,w) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{A_N(l,m) I(l,m)}{\sqrt{1 - l^2 - m^2}}\, \delta\left( \sqrt{1 - l^2 - m^2} - n' - 1 \right) \exp\left\{ -j2\pi (ul + vm + wn') \right\} dl\, dm\, dn' . \tag{11.24}$$

The delta function $\delta(\sqrt{1 - l^2 - m^2} - n' - 1)$ is introduced to maintain the condition
$n = \sqrt{1 - l^2 - m^2}$ and thereby to allow $n'$ to be treated as an independent variable in
the Fourier transformation. In a practical observation, V is measured only at points
at which the sampling function W(u, v, w) is nonzero. The Fourier transform of the
sampled visibility defines a three-dimensional intensity function $I_3'$ as follows:


$$I_3'(l, m, n') = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} W(u,v,w) V(u,v,w)\, e^{j2\pi (ul + vm + wn')} du\, dv\, dw . \tag{11.25}$$


This is the Fourier transform of the product of the two functions W (u, v, w) and 
V(u, v,w), which by the convolution theorem is equal to the convolution of the 
Fourier transforms of the two functions. Thus, 


$$I_3'(l, m, n') = \left\{ \frac{A_N(l,m) I(l,m)\, \delta\left( \sqrt{1 - l^2 - m^2} - n' - 1 \right)}{\sqrt{1 - l^2 - m^2}} \right\} \ast\ast\ast\, \overline{W}'(l, m, n') . \tag{11.26}$$


Fig. 11.8 (a) One hemisphere of the unit sphere in (l, m, n) coordinates. The point R is the origin
of the (l, m, n) coordinates. O is the origin of the (l, m, n′) coordinates, which is the phase reference
point. (b) Section through the unit sphere in the (m, n) plane. The shaded area represents the extent
of the function $I_3$. A source at point A would not appear, or would be greatly attenuated, in a
two-dimensional analysis in the (l, m) plane. The width of the three-dimensional “beam” in the
n direction should be comparable to that in l and m, since the range of the sampling function in
w is comparable to that in u and v if the observations cover a large range in hour angle. (In the
superficially similar case in Fig. 3.5, the intensity function is not confined to the surface of the
sphere because the measurements are all made in the w′ = 0 plane.)


Here, $\overline{W}'(l, m, n')$ is the Fourier transform of the three-dimensional sampling
function W(u, v, w), and the triple asterisk denotes three-dimensional convolution. 
Having determined the result of the Fourier transformation, we can now replace n’ 
by (n — 1), and Eq. (11.26) becomes 


$$I_3(l, m, n) = \left\{ \frac{A_N(l,m) I(l,m)\, \delta\left( \sqrt{1 - l^2 - m^2} - n \right)}{\sqrt{1 - l^2 - m^2}} \right\} \ast\ast\ast\, \overline{W}(l, m, n) . \tag{11.27}$$


The expression in the braces on the right side of Eq. (11.27) is confined to the surface
of the unit sphere $n = \sqrt{1 - l^2 - m^2}$, since the delta function is nonzero only on the
sphere. The function $\overline{W}$ with which it is convolved is the Fourier transform of the
sampling function and is, in effect, a three-dimensional dirty beam. The convolution
has the effect of spreading the expression so that $I_3$ has finite extent in the radial
direction of the sphere. Figure 11.8a shows the unit sphere centered on the origin of
(l, m, n) coordinates at R. The (l, m) plane in which the results of the conventional
two-dimensional analysis lie is tangent to the unit sphere at O, at which point n = 1
and n′ = 0. Note that since l, m, and n are direction cosines, the unit sphere in
(l, m, n) is a mathematical concept, not a sphere in real space.

Several ways of obtaining an undistorted wide-field image are possible (Cornwell 
and Perley 1992). 


1. Three-Dimensional Transformation. $I_3(l, m, n)$ can be deconvolved by means of
a three-dimensional extension of the CLEAN algorithm. This is complicated 
by the fact that the visibility is, in practice, not as well sampled in w as it is
in u and v. From Fig. 3.4, the large values of w occur for large zenith angles
of the target source. In Fig. 11.8b, the width of the angular field is $\theta_f$. The
transform must be computed over the range of (l,m) within this field, and over 
the range PQ in n. Cornwell and Perley (1992) suggest using a direct (rather than 
discrete) Fourier transform in the n to w transformation, since otherwise the poor 
sampling may result in serious sidelobes and aliasing. Thus, two-dimensional 
FFTs are performed in a series of planes normal to the n axis. The number of 
planes required is equal to PQ divided by the required sampling interval in the 
n direction. The range of measured visibility values has a width $2|w|_{\max}$ in the w
direction, so, by the sampling theorem, the intensity function is fully specified in
the n coordinate if it is sampled at intervals of $(2|w|_{\max})^{-1}$. The distance PQ is
approximately equal to $\theta_f^2/8 \approx \frac{1}{2}|l^2 + m^2|_{\max}$ [note that the angle POQ = $\theta_f/4$,
and $(\theta_f/2)^2 = |l^2 + m^2|_{\max}$]. Thus, the number of planes in which the two-dimensional
intensity must be calculated is $|l^2 + m^2|_{\max} |w|_{\max}$. [This result can
also be obtained by taking the phase term in Eq. (3.8) that is omitted in going
from three to two dimensions and sampling at the Nyquist interval of half a turn
of phase.] The maximum possible value of w is $D_{\max}/\lambda$, where $D_{\max}$ is the longest
baseline in the array. If $\theta_f$ is limited by the beamwidth of antennas of diameter
d, for which the angular distance from the beam center to the first null is $\sim \lambda/d$,
the required number of planes is $\sim (\lambda/d)^2 \times D_{\max}/\lambda = \lambda D_{\max}/d^2$. Examples of
images made using this method are given by Cornwell and Perley (1992). 

2. Polyhedron Imaging. The area of the unit sphere for which the image is required 
can be divided into a number of subfields, which can be imaged individually 
using the small-field approximation. Each one is imaged in two dimensions onto 
a plane that is tangent to the unit sphere at a different point on the sphere. 
These tangent points are the phase centers for the individual subfields. For each 
subfield image, it is necessary to adjust both the visibility phases and the (u, v, w) 
coordinates of the whole database to the particular phase center. The subfields 
can be combined using methods similar to those used in mosaicking, including 
joint deconvolution. This approach has been referred to as polyhedron imaging 
because the various image planes form part of the surface of a polyhedron. Again, 
examples are given by Cornwell and Perley (1992). 

3. Combination of Snapshots. In most synthesis arrays, the antennas are mounted 
on an area of approximately level ground and thus lie close to a plane at 
any given instant. In such cases, a long observation can be divided into a 
series of “snapshots,” for each of which the planar baseline condition applies 
individually. It should therefore be possible to make an image by combining 
a series of snapshot responses. Each snapshot represents the true intensity 
distribution convolved with a different dirty beam, since the (u,v) coverage 
changes progressively as the source moves across the sky. Ideally, deconvolution 
would thus require optimization of the intensity distribution using the snapshot 
responses in a combined manner rather than individually. It should be noted that 
the plane in which the baselines lie for any snapshot is, in general, not normal to 
the direction of the target source. As a result, the angle at which points on the unit
sphere in Fig. 11.8a are projected onto the (l, m) plane is not parallel to the n axis
and varies with the position of the source on the sky. Positions of sources in the
snapshot images suffer an offset in (l, m) that is zero at the phase center but that
increases with distance from the phase center. Images should be corrected for this 
effect before being combined. Since the required correction varies with the hour 
angle of the source, in long observations, the effect can cause smearing of source 
details in the outer part of the field. Perley (1999b) discusses this effect and its 
correction. Bracewell (1984) has discussed a method similar to the combination 
of snapshots described above. 

4. Deconvolution with Variable Point-Source Response. In some cases, the effect 
of two-dimensional Fourier transformation is principally distortion of the point- 
source response in the outer parts of the field, without serious attenuation 
of the response. Then a possible procedure is deconvolution using a point- 
source response (the dirty beam) that is varied over the field to match the 
calculated response (McClean 1984). This approach was used by Waldram
and McGilchrist (1990) in analysis of a survey using the Cambridge Low- 
Frequency Synthesis Telescope, which operated at 151 MHz using Earth rotation 
and baselines that are offset from east-west by 3°. Point-source responses 
were computed for a grid of positions within the field, and the response for 
any particular position could then be obtained by interpolation. The principal 
requirement was to obtain accurate positions and flux densities for sources 
identified in images obtained by two-dimensional transformation. Fitting the 
appropriate theoretical beam response for each source position allowed distortion 
of the beam, including any position offset, to be accounted for. The procedure 
was relatively inexpensive in computer time. 

5. W-Projection. W-projection (Cornwell et al. 2008) is a more efficient method of 
handling the problem of noncoplanar baselines. This problem occurs when the 
width of the synthesized field is sufficiently large that the w term in the exact 
visibility equation [(3.7) and (11.23)] cannot be neglected. In w-projection, we 
start by rewriting the visibility equation, Eq. (11.23), as 


$$V(u,v,w) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{A_N(l,m) I(l,m)}{\sqrt{1 - l^2 - m^2}}\, G(l, m, w)\, e^{-j2\pi (ul + vm)} dl\, dm , \tag{11.28}$$

where

$$G(l, m, w) = e^{-j2\pi w \left( \sqrt{1 - l^2 - m^2} - 1 \right)} , \tag{11.29}$$


so that the w dependence is contained within G(l, m, w), and the other parts of
Eq. (11.28) represent V(u, v, w = 0). If $\tilde{G}(u, v, w)$ is the Fourier transform of
G(l, m, w) with respect to (u, v) and (l, m), Eq. (11.28) can be written as a two-dimensional
convolution in (u, v),


$$V(u,v,w) = V(u, v, w = 0) \ast\ast\, \tilde{G}(u, v, w) . \tag{11.30}$$

Again, we can visualize (u, v, w) space with the u and v axes in a horizontal plane
with w increasing vertically upward. The measured visibility values are located 
within a block of (u,v,w) space of dimensions limited by the longest antenna 
spacings and the geometry of the observations. Generally, the observations are 
designed to optimize the uniformity of sampling of the visibility in the u and v 
dimensions, but the sampling in w is usually relatively sparse. The procedure in 
w-projection is to project the three-dimensional visibility data onto the (u, v, w = 0) 
plane, from which a two-dimensional Fourier transform provides an image in l and
m. The (u, v, w = 0) plane is parallel to the tangent plane on the celestial sphere at 
the field center and thus represents data for which the ray paths from a source at the 
field center to the corresponding pair of antennas are of equal length. Data for which 
w is nonzero are those for which the ray paths differ in length by w wavelengths. 
To use such data to obtain visibility in the (u,v,w = 0) plane, it is necessary to 
account for the additional path length to one antenna of each pair. In propagating 
the extra distance in space, the radiation from a point is spread by diffraction, so a 
single (u, v, w) point is spread into a diffraction pattern at w = 0. This spread of the 
pattern results from the width of the convolution function $\tilde{G}(u, v, w)$ in Eq. (11.30)
and is approximately proportional to |w|.

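If we use the approximation $\sqrt{1 - l^2 - m^2} \approx 1 - (l^2 + m^2)/2$, Eq. (11.29)
becomes


$$G(l, m, w) \approx e^{j\pi w (l^2 + m^2)} . \tag{11.31}$$


Fourier transformation then gives


$$\tilde{G}(u, v, w) \approx \frac{j}{w}\, e^{-j\pi (u^2 + v^2)/w} . \tag{11.32}$$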


The visibility V(u, v, w) is entirely determined by V(u, v, w = 0) through
convolution with $\tilde{G}(u, v, w)$. Thus, V(u, v, w = 0) contains all the data that are required
to provide an accurate image, limited only by the synthesized (dirty) beam. Nothing
essential to the image is lost in the transition from three dimensions to two. The
same convolution function $\tilde{G}(u, v, w)$ applies to projection in both directions, i.e., from
(u, v, w = 0) to (u, v, w) and vice versa. Note that the convolving function is
different for each (u, v, w) data point. Cornwell et al. (2008) point out that this
convolutional relationship between a two-dimensional function and a three-dimensional one
is due to the fact that the original brightness is confined to the two-dimensional 
surface of the celestial sphere. They also discuss the result in terms of the diffraction 
of the electric field over the w-coordinate space. 
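
The accuracy of the small-angle form of the w-dependent phase factor can be checked numerically. The sketch below compares the exact factor of Eq. (11.29) with the quadratic approximation of Eq. (11.31); the function names are illustrative, and the sample values of l and w are arbitrary choices rather than values taken from the text.

```python
import cmath
import math


def g_exact(l, m, w):
    """Exact w-dependent phase factor, as in Eq. (11.29)."""
    return cmath.exp(-2j * math.pi * w * (math.sqrt(1.0 - l * l - m * m) - 1.0))


def g_small_angle(l, m, w):
    """Quadratic (small-angle) approximation, as in Eq. (11.31)."""
    return cmath.exp(1j * math.pi * w * (l * l + m * m))


def phase_error(l, m, w):
    """Phase of the exact factor relative to the approximation, in radians."""
    return cmath.phase(g_exact(l, m, w) / g_small_angle(l, m, w))
```

The leading term of the difference is $\pi w (l^2 + m^2)^2/4$, so the error grows as the fourth power of the field radius: for w = 1000 wavelengths it is a few microradians at l = 0.01 but several hundredths of a radian at l = 0.1.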

The w-projection imaging procedure is as follows. First, the visibility data 
are gridded in (u, v,w) and then projected onto the (u,v,w = 0) plane. In 
the projection, the data are spread in (u, v) space by the convolution,¹ and thus
regridding in the (u,v,w = 0) plane is required. A two-dimensional Fourier 
transform then provides the dirty image, from which the dirty beam must then be 
deconvolved by the CLEAN algorithm or some alternate procedure (see Sect. 11.1). 
CLEAN requires numerous transpositions of data between the visibility and image 
domains. In going from the model image to visibility, a two-dimensional transform 
provides V(u,v,w = 0), from which projection gives values at (u,v, w) points 
required for comparison with the observations. For the regridding steps, convolution 
with a spheroidal or other gridding function is required. Since convolution is 
commutative and associative, it can be computationally efficient to convolve the 
spheroidal function with the projection function $\tilde{G}$ and thus store the combined
convolution functions for use with each (u, v, w) grid point. Convolution of $\tilde{G}$ with
the spheroidal function has the additional benefit of damping the behavior of $\tilde{G}$ as
w → 0.

Cornwell et al. (2008) also provide details of a simulated example of wide-field 
imaging using w-projection. They compare the results with the method of image- 
plane facets (similar to polyhedron imaging), and also uvw-space facets (similar to
mosaicking), which projects the (u, v) space, rather than image space, onto tangent
plane facets. Hitherto, the facets method has been perhaps the most widely used
procedure for wide-field imaging. Cornwell et al. conclude that, with regard to 
computing load, the facets method is roughly competitive with w-projection for 
images of low dynamic range but that w-projection is superior when high sensitivity 
and dynamic range are required. 

A variation of w-projection imaging, which is computationally less expensive, is 
called w-stacking (Offringa et al. 2014). 
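
The plane-count estimate given above for the three-dimensional transformation is easily evaluated. The sketch below computes the product of the maximum value of $l^2 + m^2$ and the maximum value of w, with the field half-width set by the first null of the antenna beam as in the text; the function names and the example numbers are illustrative.

```python
def n_planes_from_field(lm2_max, w_max):
    """Number of n-planes: |l^2 + m^2|_max times |w|_max (w in wavelengths)."""
    return lm2_max * w_max


def n_planes(wavelength_m, d_max_m, dish_m):
    """Plane count when the field is limited by the antenna beam."""
    half_width = wavelength_m / dish_m   # beam center to first null, radians
    w_max = d_max_m / wavelength_m       # longest baseline in wavelengths
    return n_planes_from_field(half_width ** 2, w_max)
```

For a 25-m antenna at a wavelength of 2 m and an assumed longest baseline of 36 km, `n_planes(2.0, 36000.0, 25.0)` gives about 115 planes, in agreement with $\lambda D_{\max}/d^2$.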


11.8 Some Special Techniques of Image Analysis 


11.8.1 Use of CLEAN and Self-Calibration 
with Spectral Line Data 


A procedure that has been found to provide accurate separation of the contin- 
uum from the line features involves use of the deconvolving algorithm CLEAN 
(van Gorkom and Ekers 1989). However, if CLEAN is applied individually to the 
images for the different channels, errors in the CLEAN process appear as differences 
from channel to channel and may be confused with true spectral features. Such 
errors can be avoided by subtracting the continuum before applying CLEAN to the 
line data. First, CLEAN is applied to an average of the continuum-only channels, 


¹Cornwell et al. (2008) point out that, interestingly, this spreading in (u, v) space shows that, in
general, a single antenna pair responds to a range of spatial frequencies, except when w = 0 (i.e.,
when the baseline is normal to the direction of incidence of the wavefront).

and the visibility components removed from these channels are also removed from
the visibility data for the channels containing line features. When the CLEAN 
process is terminated, the residuals are also removed from the line data. The 
resulting line-channel images, which should then contain only line data, can be 
deconvolved individually. Note that since absorption of the continuum may occur in 
the line frequency channels, images of line-minus-continuum may contain negative 
as well as positive intensity features. Thus, algorithms such as MEM that depend on 
positivity of the intensity may not be easily applicable in such cases. 
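
The net effect of the continuum subtraction on a single visibility sample can be illustrated with a minimal sketch. It assumes that the CLEAN components plus the residuals together account for the whole continuum average, so the subtraction is approximated by removing the mean of the line-free channels; it ignores any variation of the continuum or of the (u, v) coordinates across the band. The function name, channel indices, and flux values are hypothetical.

```python
def subtract_continuum(vis, line_free):
    """Subtract a channel-averaged continuum from one visibility spectrum.

    vis       : list of complex visibilities, one per frequency channel,
                for a single baseline and time sample
    line_free : indices of channels assumed to contain continuum only
    """
    # Average of the continuum-only channels (stand-in for the CLEAN
    # model plus residuals of the continuum image).
    continuum = sum(vis[k] for k in line_free) / len(line_free)
    # Remove the same continuum visibility from every channel.
    return [v - continuum for v in vis]


# Hypothetical spectrum: 2-Jy continuum with an absorption line of
# depth 0.5 Jy in channel 3 (zero phase for simplicity).
spectrum = [2 + 0j, 2 + 0j, 2 + 0j, 1.5 + 0j, 2 + 0j, 2 + 0j]
line_only = subtract_continuum(spectrum, line_free=[0, 1, 4, 5])
```

The absorption channel comes out negative after subtraction, which is the situation in which positivity-based algorithms such as MEM become difficult to apply.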

In applying self-calibration to eliminate phase errors in spectral line data, it can 
generally be assumed that phase and amplitude differences between channels vary 
only very little with time and are removed by the bandpass calibration. This is true 
for both atmospheric and instrumental effects. Thus, the strongest spectral feature in 
the field under investigation can be used to determine the phase-calibration solution, 
which is then applied to all channels. 


11.8.2 A-Projection 


Observations of the most distant Universe, which require removal of the emissions 
from the foreground, require observations at the highest precision and correspond- 
ingly accurate calibration of instrumental effects. Calibration of the responses of 
the individual antennas includes correcting for direction-dependent (DD) gains in the deconvolution of
images, as discussed by Bhatnagar et al. (2008), Smirnov (2011a,b), and others.
DD gains² include instrumental and atmospheric effects that affect the pointing and
polarization of the antenna responses. Correction for DD effects includes taking
account of the rotation of the antennas relative to the sky that results from altazimuth
tracking. The DD effects for each antenna can be represented in a 2 × 2 Jones matrix,
a separate one for each pixel of the image. For each cross-correlated antenna pair, the
signal product is represented by the outer product of the two Jones matrices, which
provides a 4 × 4 Mueller matrix for each pixel. The diagonal elements of the Mueller
matrix represent the four principal products of the two cross-polarization terms of 
each antenna, for either linear or circular polarization. The off-diagonal terms are 
small and result from errors in the cross-polarization adjustment and from leakage. 
These terms must be included if accuracy better than ~1% is required in the image.
This procedure has been referred to as the narrowband A-projection algorithm, in
which A refers to the elements $A_{ij}$, each of which is the complex convolution of the aperture
illumination patterns of antennas i and j. The details of the cross products depend 
upon the details of the particular array, and Bhatnagar et al. (2008) consider the case 
for the VLA, in which shading by the feed legs is one of the factors represented 
by the off-axis terms. The derivation of the image from observations involves an 


²Direction-independent gains include, for example, the gain of the receiver system and are
generally much simpler to correct for.

iterative χ² minimization in which χ represents the difference between observed
visibility and the visibility of a model that is developed. Calculation of gradients of
χ² can be used as an aid in the minimization.
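
The construction of a 4 × 4 Mueller matrix from two 2 × 2 Jones matrices can be sketched as follows. This is an illustration for a single pixel only; it assumes the common convention in which the matrix for the pair (i, j) is the Kronecker product of the Jones matrix of antenna i with the complex conjugate of that of antenna j. The function names and the leakage value are hypothetical.

```python
def kron2(a, b):
    """Kronecker (outer) product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def conj2(m):
    """Element-wise complex conjugate of a 2x2 matrix."""
    return [[x.conjugate() for x in row] for row in m]


def mueller(jones_i, jones_j):
    """4x4 Mueller matrix for the antenna pair (i, j) at one image pixel.

    Assumed convention: M = J_i (x) conj(J_j).
    """
    return kron2(jones_i, conj2(jones_j))


# Ideal antenna (identity Jones matrix) and one with a small,
# hypothetical cross-polarization leakage d.
ideal = [[1 + 0j, 0j], [0j, 1 + 0j]]
d = 0.01 + 0j
leaky = [[1 + 0j, d], [-d, 1 + 0j]]

m_ideal = mueller(ideal, ideal)   # identity: no polarization mixing
m_leaky = mueller(leaky, ideal)   # off-diagonal terms of order d appear
```

With leakage at the 1% level, the off-diagonal Mueller elements are of order 0.01, which is why they matter once accuracy better than about 1% is required.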

Bhatnagar et al. (2013) expand these concepts to cover a wider frequency 
bandwidth and develop an A-projection algorithm that includes variation of the 
model parameters as a function of frequency. A wide bandwidth ratio (e.g., 2 : 1) 
improves the sensitivity of the observations but requires careful consideration since 
in the outer parts of the beam, the response varies rapidly with frequency. On 
wideband, wide-field imaging, see also Rau and Cornwell (2011). An extension 
of the A-projection technique called fast holographic deconvolution, which is 
particularly useful for very wide field-of-view observations, has been developed by 
Sullivan et al. (2012). 


11.8.3 Peeling 


Synchrotron emission from radio sources usually becomes stronger as the frequency 
is reduced, and hence, the density of strong sources on the sky generally increases 
with decreasing frequency. At low frequencies, it is therefore often important to 
image the whole antenna beam to avoid source confusion resulting from aliasing. 
Also, the gain of the main beam of a reflector antenna decreases with decreasing 
frequency, and if phased arrays of dipoles are used, they have to be very large to 
maintain high gain. As a result, sources in the sidelobes may not be as effectively 
suppressed relative to a source in the main beam, as is possible at higher frequencies. 
In the data analysis, unwanted responses from strong sources with known positions 
can be subtracted. In this process, known as peeling (Noordam 2004; van der Tol 
et al. 2007), the response to such sources, down to the lowest calibrated level of 
the sidelobes, is removed. This usually starts with the strongest source in the field 
and then the second strongest, and so on. The removal can be done in the visibility 
domain. A procedure of this type is essential in the measurement of the weakest 
sources and the Epoch of Reionization signatures (see Sect. 10.7.2). Some further 
discussion of peeling can be found in Bhatnagar et al. (2008), Mitchell et al. (2008), 
and Bernardi et al. (2011). 
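
The visibility-domain subtraction at the heart of peeling can be sketched for point sources. This minimal example assumes ideal, direction-independent gains and a small-field Fourier kernel with the sign convention shown; practical peeling also solves for the gains toward each source before subtracting it, which is omitted here. All names and source parameters are hypothetical.

```python
import cmath
import math


def point_source_vis(flux, l, m, u, v):
    """Visibility of a point source of given flux at direction cosines (l, m)."""
    return flux * cmath.exp(-2j * math.pi * (u * l + v * m))


def peel(vis, uv, sources):
    """Subtract point-source models from the visibilities, strongest first.

    vis     : list of complex visibilities
    uv      : list of (u, v) coordinates in wavelengths, one per visibility
    sources : list of (flux, l, m) tuples for the sources to be removed
    """
    residual = list(vis)
    for flux, l, m in sorted(sources, key=lambda s: -s[0]):
        residual = [r - point_source_vis(flux, l, m, u, v)
                    for r, (u, v) in zip(residual, uv)]
    return residual


# Two strong off-axis sources and a weak target at the phase center.
uv = [(100.0, 0.0), (0.0, 250.0), (300.0, -120.0)]
strong = [(10.0, 0.01, 0.0), (25.0, -0.02, 0.015)]
weak = (0.1, 0.0, 0.0)
vis = [sum(point_source_vis(f, l, m, u, v) for f, l, m in strong + [weak])
       for u, v in uv]
residual = peel(vis, uv, strong)  # only the weak target remains
```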


11.8.4 Low-Frequency Imaging 


In addition to source confusion, a complication of wide-field imaging is the
variation of ionospheric effects over the field of view (see Sect. 14.1). The excess
path length in the ionosphere is proportional to $\nu^{-2}$, so the resulting phase change is
proportional to $\nu^{-1}$. The term isoplanatic patch is used to denote an area of the
sky over which the variation in the path length for an incoming wave is small
compared with the observing wavelength. At centimeter and shorter wavelengths,
the beams of reflector antennas used in synthesis arrays are generally smaller than 
the isoplanatic patch (see Table 14.1). Thus, the effect of an irregularity in the 
ionosphere (or troposphere) is constant over the beam and can be corrected by a 
single phase adjustment for each antenna, for example, by self-calibration. However, 
at meter wavelengths, the size of the antenna beam may be several times that of 
the ionospheric isoplanatic patch. In observations with the VLA in New Mexico, 
Erickson (1999) estimated that the size of the isoplanatic patch at 74 MHz is ~3°-4°,
whereas the beamwidth of a 25-m-diameter antenna at the same frequency is
~13°. Later low-frequency instruments have used phased arrays, which enable
much smaller beams to be formed. These include LOFAR, covering 15-80 MHz
and 110-240 MHz (de Vos et al. 2009); the Murchison Widefield Array, covering
80-300 MHz (Lonsdale et al. 2009); and the Long Wavelength Array, covering
10-88 MHz (Ellingson et al. 2009).

Although, at meter wavelengths, arrays of dipoles or similar elements are more 
generally used than parabolic reflectors, some early measurements using the 25-m- 
diameter antennas of the VLA by Kassim et al. (1993) are of interest. These include 
simultaneous measurements of a number of strong sources at 74 and 330 MHz, 
using a phase reference procedure to calibrate the phases at the lower frequency. 
At 74 MHz, the phase fluctuations are dominated by the ionosphere, and rates of 
phase change were found to be as high as one degree per second. These precluded 
calibration by the usual methods. However, at 330 MHz, the rates of phase change 
were slow enough to allow imaging of strong sources. The resulting 330-MHz 
phases were scaled to 74 MHz and used to remove the ionospheric component 
from the 74-MHz data that were recorded simultaneously. The procedure used for 
obtaining images at 74 MHz was essentially as follows: 


1. Simultaneous observations of a strong source were made at 74 and 330 MHz, 
with periodic observations of a calibrator at 330 MHz. 

2. An image of the target source was made at 330 MHz using the standard 
techniques (i.e., use of a calibrator as at centimeter wavelengths). This was used 
as a starting model for self-calibration of the 330-MHz data. 

3. The self-calibration provided phase calibration for each antenna at 330 MHz. 
These values were then scaled to 74 MHz and used to remove the ionospheric 
variations from the 74 MHz data, the ionospheric phase changes being inversely 
proportional to frequency. 

4. The instrumental phases at 330 and 74 MHz were different at each antenna 
as a result of dissimilar cable lengths, etc. To calibrate these differences, an 
unresolved calibrator was observed at both 330 and 74 MHz. The ionospheric 
variations could be removed from the 74-MHz calibrator phases using the scheme 
in step 3. 

5. The 74-MHz image of the target source was made from the calibrated phase data. 
Self-calibration of the 74-MHz data was used to remove residual phase drifts, and 
for this, the 330-MHz image provided a suitable starting model.

For the strongest sources, for which it was possible to obtain a good signal-to-noise
ratio in an averaging time of no more than 10 s, self-calibration at 74 MHz
was sufficient in most cases. Although only eight VLA antennas were equipped 
for operation at 74 MHz, images with dynamic range of better than 20 dB were 
obtained for several sources. The problem of noncoplanar baselines did not arise 
in these measurements because the sources were compact enough for satisfactory 
two-dimensional imaging. 
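
The frequency scaling used in step 3 of the procedure above can be sketched as follows. The example assumes that the 330-MHz antenna phases are purely ionospheric, have been unwrapped, and scale inversely with frequency; it also assumes that the visibility phase of a pair (a, b) contains the difference of the two antenna phases. Instrumental offsets (step 4) are not handled, and the function names are hypothetical.

```python
import cmath


def scale_phase(phase_330_rad, f_ref_mhz=330.0, f_target_mhz=74.0):
    """Scale an unwrapped ionospheric phase (radians) from f_ref to f_target,
    assuming the phase is inversely proportional to frequency."""
    return phase_330_rad * f_ref_mhz / f_target_mhz


def remove_ionosphere(vis_74, phase_a_330, phase_b_330):
    """Remove the scaled ionospheric phase difference of antennas a and b
    from one 74-MHz visibility sample."""
    dphi = scale_phase(phase_a_330) - scale_phase(phase_b_330)
    return vis_74 * cmath.exp(-1j * dphi)


# A 74-MHz visibility corrupted by the scaled 330-MHz phase difference.
true_vis = 1.0 + 0.0j
phase_a, phase_b = 0.2, -0.1                     # radians at 330 MHz
corrupted = true_vis * cmath.exp(1j * (phase_a - phase_b) * 330.0 / 74.0)
restored = remove_ionosphere(corrupted, phase_a, phase_b)
```

The scaling factor 330/74 is about 4.5, so a modest phase error at 330 MHz becomes a large one at 74 MHz; this is why the 330-MHz phases must be tracked without 2π ambiguities before scaling.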


11.8.5 Lensclean 


In many cases, the image of a quasar or radio galaxy is distorted
by the gravitational field of a galaxy, a phenomenon discovered
by Walsh et al. (1979). The line of sight from the lensed source intersects, or
passes very close to, the galaxy. In some cases, the gravitational lensing results 
in multiple images of a single point-source quasar, and in other cases, extended 
structure is involved: see, for example, Narayan and Wallington (1992). In studies of 
gravitational lensing, the structure of the gravitational field is of major astrophysical 
importance. The term lensclean has been used to denote a method of analysis,
including several variations of the original algorithm, that allows the lensing field 
to be determined by synthesis imaging. The basis of these methods is analogous to 
self-calibration, in which the image is sufficiently overdetermined by the visibility 
measurements that it is possible to determine also the complex gains of the antennas. 
In lensclean, it is the pattern of the gravitational field that is determined. An 
additional complication is that points in the source of the radiation can each 
contribute to more than one point in the synthesized image. 

The original lensclean procedure (Kochanek and Narayan 1992) is based on 
an adaptation of the CLEAN algorithm. The basic principle can be described as 
follows. Consider the case in which the source that is imaged by the lens contains 
extended structure. An initial model for the lens is chosen. Each point in the source 
contributes to multiple points in the image, and this procedure from the source to 
the image is defined by the lens model. For any point in the source, the intensity 
in the image should ideally be the same at each point at which it appears, since the 
imaging involves only geometric bending of the radiation from the source, as in an 
optical system. Suppose that the jth source pixel contributes to $n_j$ image pixels. In
practice, the intensity of these pixels in the image is not equal because of defects in
the lens model and noise in the image. The best estimate of the intensity of the pixel
in the source is the mean intensity of the corresponding pixels in the image. Thus,
one can subtract components from the image in the manner of CLEAN and build up
an image of the source. For each source pixel for which $n_j > 1$, the mean-squared
deviation of the intensity of the corresponding image pixels from the mean intensity
of the $n_j$ image pixels, $\sigma_j^2$, is calculated. For a good lens model, the mean value
of $\sigma_j^2$ over the pixels in the source should be no greater than the variance of the noise
in the image, $\sigma_{\rm noise}^2$. If the number of degrees of freedom in the source image is taken
to be equal to the number of pixels, then the statistical measure of the quality of the
lens model is $\chi^2 = \sum_j (\sigma_j^2 / \sigma_{\rm noise}^2)$, where the sum is taken over the j source pixels.
The lens parameters can thus be varied to minimize $\chi^2$. In practice, the procedure
is more complicated than indicated by the description above. Modifications are 
included to take account of the finite resolution of the image, which has the effect 
of spreading the contribution of each source pixel over a number of image pixels. 
Also, for any unresolved structure in the source, the intensity of the corresponding 
structure in the image depends on the magnification of the lens. 
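
The quality measure for a trial lens model can be sketched in a few lines. This is a simplified illustration of the idealized description only: it ignores the finite resolution of the image and the magnification of unresolved structure, and it takes the mapping from source pixels to image pixels as given. The data layout and names are hypothetical.

```python
def lens_chi2(image_values, noise_var):
    """Source estimate and chi-squared for one trial lens model.

    image_values : dict mapping source pixel j to the list of intensities
                   of the n_j image pixels to which it maps
    noise_var    : variance of the noise in the image
    """
    source = {}
    chi2 = 0.0
    for j, vals in image_values.items():
        mean = sum(vals) / len(vals)
        source[j] = mean                  # best estimate of source intensity
        if len(vals) > 1:                 # only multiply imaged pixels count
            sigma2_j = sum((x - mean) ** 2 for x in vals) / len(vals)
            chi2 += sigma2_j / noise_var
    return source, chi2


# A consistent model (equal intensities) and an inconsistent one.
src_good, chi2_good = lens_chi2({0: [1.0, 1.0], 1: [2.0]}, noise_var=0.5)
src_bad, chi2_bad = lens_chi2({0: [1.0, 3.0], 1: [2.0]}, noise_var=0.5)
```

An outer optimizer would regenerate the mapping for each set of lens parameters and keep the parameters that give the smallest value.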

Ellithorpe et al. (1996) introduced a visibility lensclean procedure in which the 
CLEAN components are removed from the ungridded visibility values under the 
constraints of a lens model. The squared deviations of the measured visibility from
a model are used to determine a $\chi^2$ statistic. The quality of the fit is judged from the
variance of the measured visibility, and the number of degrees of freedom is
$2N_{\rm vis} - 3N_{\rm src} - N_{\rm lens}$. Here, $N_{\rm vis}$ is the number of visibility measurements (which each have
two degrees of freedom), $N_{\rm src}$ is the number of independent CLEAN components in
the source model (three degrees of freedom, from position and amplitude), and $N_{\rm lens}$
is the number of parameters in the lens model. Ellithorpe et al. compared results of
the original lensclean with visibility lensclean and found the best results from the 
latter, with a further improvement if a self-calibration step is added. The use of the 
MEM algorithm as an alternative to CLEAN has also been investigated (Wallington 
et al. 1994). 


11.8.6 Compressed Sensing 


Compressed sensing, also known as compressive sensing, compressive sampling, or 
sparse sampling, is a widely used signal processing technique generally employed 
to reduce the size of data sets, e.g., images, without loss of information. Sampling 
at the Nyquist interval provides the most general and complete representation of 
an image. However, if an image is sparse, i.e., it is mostly blank with isolated 
components or can be represented by a small number of basis functions such 
as wavelets, then it is possible to compress or reduce the image size far below 
that required for Nyquist sampling. The theory of compressed sensing has formal 
requirements such as sparsity and incoherent sampling. The latter requirement in 
interferometric imaging corresponds to random sampling in the (u, v) plane. Under 
such conditions, an image can be derived exactly with very high probability from 
a sparse set of visibility measurements. These conditions are not perfectly met in 
radio interferometry. Nonetheless, much can be learned from compressed sensing 
techniques [see Li et al. (2011a,b)].

In the application to interferometric imaging, the method is formulated in a way 
to obtain an accurate image from an incompletely sampled (u, v) plane data set. 
The degree of success of the method depends on the signal-to-noise ratio in the
(u, v) plane and the amount of information that can be supplied to constrain the
image solution while being consistent with the (u,v) plane measurements. The 
simplest of these constraints are nonnegativity, compactness, and smoothness in 
the image plane. In other words, the method improves as the amount of a priori 
information available increases. For applications to radio interferometric data and 
specific algorithms, see, for example, Wiaux et al. (2009), Wenger et al. (2010), 
Li et al. (2011), Hardy (2013), Carrillo et al. (2012), Garsden et al. (2015), and 
Dabbech et al. (2015). For a general introduction to the field of compressed sensing 
as a signal-processing tool, see Candès and Wakin (2008), and for its application to
image construction, see Candès et al. (2006a,b). Compressed sensing is widely used
in medical imaging [e.g., Lustig et al. (2008)]. 

For a simplified overview of the method and some of its key concepts, consider 
a linear vector equation that can be expressed as 


$$ V = AX, \tag{11.33} $$


where V represents visibility, X represents brightness in the image, and A is the
operator that derives the visibility from the parameters of the image, i.e., the Fourier
transform kernels.

Important quantities in the image-restoration process are the $L_p$ norms defined as

$$ L_p = \|X\|_p = \left( \sum_{n=1}^{N} |X_n|^p \right)^{1/p}, \qquad p > 0, \tag{11.34} $$

where $X_n$ are the elements of X. For p = 0, a pseudo norm $L_0$ can be written as

$$ L_0 = \|X\|_0 = \sum_{n=1}^{N} |X_n|^0, \tag{11.35} $$

with the understanding that $0^0 = 0$. In our context, $L_0$ is the number of cells in
the image with nonzero amplitude. Suppose that a point source is observed. In the
absence of measurement noise, the normalized moduli of the visibilities will all
be unity. Minimizing $L_0$ with the image constraint of Eq. (11.33) will lead to the
recovery of a source distribution of a delta function. Note that the principal image
solution is proportional to the dirty image. $L_0$ minimization tends to remove its high
sidelobe response.
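
The norms of Eqs. (11.34) and (11.35) are simple to evaluate directly. The following sketch treats the image as a flat list of pixel values; the function names are hypothetical.

```python
def lp_norm(x, p):
    """L_p norm of Eq. (11.34) for p > 0."""
    if p <= 0:
        raise ValueError("p must be positive; use l0_pseudo_norm for p = 0")
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def l0_pseudo_norm(x):
    """L_0 pseudo norm of Eq. (11.35): the number of nonzero elements."""
    return sum(1 for v in x if v != 0)


# A "near black" image: mostly empty, with two nonzero cells.
image = [0.0, 3.0, 0.0, 0.0, -4.0, 0.0]
l2 = lp_norm(image, 2)        # Euclidean norm
l1 = lp_norm(image, 1)        # sum of absolute values
l0 = l0_pseudo_norm(image)    # count of nonzero cells
```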

Unfortunately, exploration of the $L_0$ norm is computationally very intensive. In
two of the foundational papers in compressed sensing, Candès and Tao (2006) and
Donoho (2006) showed that under fairly general conditions, the $L_1$ norm, defined as

$$ L_1 = \|X\|_1 = \sum_{n=1}^{N} |X_n|, \tag{11.36} $$

is a suitable proxy for the $L_0$ norm, and it is more easily computed. The $L_1$ norm
is the total flux density for sources of positive brightness. Virtually all work in
compressed sensing is based on $L_1$ minimization. In the interferometry case, the
solution process can be described as

$$ \text{minimize } \|X\|_1 \text{ subject to } \|V - AX\|_2^2 < \epsilon, \tag{11.37} $$

where $\epsilon$ is the noise threshold based on the measurements, and $\|V - AX\|_2^2$ is
the sum of the squared visibility residuals, i.e., the goodness of fit. An equivalent
optimization can be written as

$$ \text{minimize } \left\{ \|V - AX\|_2^2 + \Lambda \|X\|_1 \right\}, \tag{11.38} $$

where $\Lambda$ is a regularization parameter that determines the relative importance of
minimizing $L_1$ and the measurement residuals. In statistics, this approach is called
the least absolute shrinkage and selection operator, or LASSO, developed by Tibshirani
(1996).
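
One standard way to solve a problem of the form of Eq. (11.38) is iterative soft thresholding (ISTA), sketched below for a small real-valued system. This is an illustration of the optimization only, not the method of any particular paper cited here: a real interferometric operator A is a complex Fourier kernel, and practical solvers are far more efficient. The step size uses the squared Frobenius norm of A as a conservative bound; all names are hypothetical.

```python
def soft_threshold(x, t):
    """Shrink x toward zero by t (the proximal operator of t*|x|)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def ista(A, V, lam, n_iter=500):
    """Minimize ||V - A X||_2^2 + lam * ||X||_1 by iterative soft thresholding.

    A : list of rows (real-valued matrix), V : list of measurements.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Conservative step: 1 / (2 * ||A||_F^2) <= 1 / (Lipschitz constant).
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    X = [0.0] * n_cols
    for _ in range(n_iter):
        resid = [sum(A[r][c] * X[c] for c in range(n_cols)) - V[r]
                 for r in range(n_rows)]
        grad = [2.0 * sum(A[r][c] * resid[r] for r in range(n_rows))
                for c in range(n_cols)]
        X = [soft_threshold(X[c] - step * grad[c], step * lam)
             for c in range(n_cols)]
    return X


# With A equal to the identity, the solution is V shrunk by lam / 2.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
solution = ista(identity, [3.0, 0.2, -2.0], lam=1.0)
```

The small middle element is driven exactly to zero, which illustrates how the $L_1$ term promotes sparsity, while the larger elements are biased toward zero by half the regularization parameter.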

Another commonly used constraint is based on total variation (TV), often
computed for two-dimensional images as

$$ {\rm TV} = \sum_{i,j} \left[ (X_{i+1,j} - X_{i,j})^2 + (X_{i,j+1} - X_{i,j})^2 \right]^{1/2}. \tag{11.39} $$

TV is also known as the $L_1$ norm for adjacent pixel differences. Minimizing TV
minimizes the gradients and favors smoother images. TV minimization can be
added to Eq. (11.38) with another $\Lambda$ term. Note that MEM imposes a similar
smoothness constraint (see Sect. 11.2.1). A nonnegative constraint can also be 
added. Collectively, the application of these constraints is known as regularization. 
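
The total variation of Eq. (11.39) can be evaluated as in the following sketch. The treatment of the image edges is an assumption: differences that would reach beyond the last row or column are taken to be zero. The function name is hypothetical.

```python
import math


def total_variation(img):
    """Total variation of a 2-D image given as a list of rows, Eq. (11.39).

    Differences that would extend past the image edge are set to zero
    (an assumed boundary convention).
    """
    n_i, n_j = len(img), len(img[0])
    tv = 0.0
    for i in range(n_i):
        for j in range(n_j):
            di = img[i + 1][j] - img[i][j] if i + 1 < n_i else 0.0
            dj = img[i][j + 1] - img[i][j] if j + 1 < n_j else 0.0
            tv += math.hypot(di, dj)
    return tv


flat = [[1.0, 1.0], [1.0, 1.0]]       # no gradients
corner = [[0.0, 0.0], [0.0, 1.0]]     # one bright pixel
tv_flat = total_variation(flat)
tv_corner = total_variation(corner)
```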

The potential for recovering source structure finer than the diffraction limit has 
been investigated by Honma et al. (2014). An example of this application is shown 
in Fig. 11.9. The possibilities of successful superresolution improve as the image 
plane becomes more sparse, i.e., a “near black” image [see Starck et al. (2002)]. 

Another variation on the above approach, more useful for extended source 
distributions, is to represent the image by a set of basis functions such as wavelets 
[see Starck and Murtagh (1994) and Starck et al. (1994)]. If the representation is 
sparse in such a basis space, $L_1$ minimization gives good results [e.g., Li et al.
(2011a,b) and Garsden et al. (2015)].

The most efficient and reliable methods of producing images from visibility data 
are the subject of continuing development. As with the advent of MEM methods, it 
is often desirable for researchers to present multiple reconstructions of images that 
are consistent with their (u, v) plane measurements.

Fig. 11.9 Reconstructed images of a simulated black hole shadow source observed with a six-element
Event Horizon Telescope Array at the position of M87. (left) Simulated image, (middle)
CLEANed image with resolution equal to that of the dirty beam, and (right) image reconstructed 
with compressed sensing regularization methods. It may be difficult to apply techniques tested on 
simulated data. From M. Honma et al. (2014), by permission of and © Oxford University Press.


Chapter 12
Interferometer Techniques for Astrometry
and Geodesy


This chapter is concerned with the techniques by which angular positions of radio 
sources can be measured with the greatest possible accuracy, and with the design 
of interferometers for optimum determination of source-position, baseline, and 
geodetic¹ parameters.

The total fringe phase of an interferometer, where the effect of delay tracking is
removed, can be expressed in terms of the scalar product of the baseline and source-
position vectors D and s, respectively, as

$$ \phi = \frac{2\pi}{\lambda}\, \mathbf{D} \cdot \mathbf{s} = \frac{2\pi}{\lambda}\, D \cos\theta, \tag{12.1} $$

where $\theta$ is the angle between D and s. Up to this point, we have assumed
that these factors are describable by constants that can be specified with high 
accuracy. However, the measurement of source positions to an accuracy better than 
a milliarcsecond (mas) requires, for example, that variation in the Earth’s rotation 
vector be taken into account. The baseline accuracy is comparable to that at which 
variation in the antenna positions resulting from crustal motions of the Earth can be 
detected. The calibration of the baseline and the measurement of source positions 
can be accomplished in a single observing period of one or more days. Geodetic 
data are obtained from repetition of this procedure over intervals of months or years, 
which reveals the variation in the baseline and Earth-rotation parameters. 


¹For simplicity, we use the term geodetic to include geodynamic and static phenomena regarding
the shape and orientation of the Earth.

The redefinition of the meter from a fundamental to a derived quantity has an
interesting implication for the units of baseline length derived from interferometric 
data. An interferometer measures the relative time of arrival of the signal wavefront 
at the two antennas, that is, the geometric delay. Baselines determined from 
interferometric data are therefore in units of light travel time. Conversion to meters 
formerly depended on the value chosen for c. However, in 1983, the Conférence 
Générale des Poids et Mesures adopted a new definition of the meter: “the meter 
is the length of the path traveled by light in vacuum during a time interval of 
1/299,792,458 of a second.” The second and the speed of light are now primary 
quantities, and the meter is a derived quantity. Thus, baseline lengths can be given 
unambiguously in meters. Issues related to fundamental units are discussed by 
Petley (1983). 
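
Because the speed of light is now exact by definition, converting a baseline measured as a light travel time into meters is a single multiplication. A minimal sketch follows; the function name and the example delay are hypothetical.

```python
C_M_PER_S = 299_792_458  # speed of light in m/s, exact by the 1983 definition


def delay_to_meters(delay_s):
    """Convert a baseline expressed as light travel time (s) to meters."""
    return delay_s * C_M_PER_S


# A baseline with a light travel time of one microsecond.
baseline_m = delay_to_meters(1.0e-6)
```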


12.1 Requirements for Astrometry 


We begin with a heuristic discussion of how baseline and source-position parameters 
may be determined. A more formal discussion is given in Sect. 12.2. 

The phase of the fringe pattern for a tracking interferometer [Eq. (12.1)] can be
expressed in polar coordinates (see Fig. 4.2) as

$$ \phi(H) = 2\pi D_\lambda \left[ \sin d \sin\delta + \cos d \cos\delta \cos(H - h) \right] + \phi_{\rm in}, \tag{12.2} $$

where $D_\lambda$ is the length of the baseline in wavelengths, H and $\delta$ are the hour angle
and declination of the source, h and d are the hour angle and declination of the
baseline, and $\phi_{\rm in}$ is an instrumental phase term. We assume for the purpose of this
discussion that $\phi_{\rm in}$ is a fixed constant, unaffected by the atmosphere and electronic
drifts. The hour angle is related to the right ascension $\alpha$ by

$$ H = t_s - \alpha, \tag{12.3} $$

where $t_s$ is the sidereal time (in VLBI, $t_s$ and H are referred to the Greenwich
meridian, whereas in connected interferometry, they are often referred to the local 
meridian). Consider a short-baseline interferometer that is laid out in exactly the 
east-west direction, i.e., with d = 0, $h = -\pi/2$ [see Ryle and Elsmore (1973)]. Then

$$ \phi(H) = -2\pi D_\lambda \cos\delta \sin H + \phi_{\rm in}, \tag{12.4} $$

and the phase goes through one sinusoidal oscillation in a sidereal day. Suppose
that the source is circumpolar, i.e., above the horizon for 24 h. From continuous
observation of $\phi$ over a full day, the $2\pi$ crossings of $\phi$ can be tracked so that there
are no phase ambiguities. The average value of the geometric term of Eq. (12.4) is
zero, so that $\phi_{\rm in}$ can be estimated and removed. When the source transits the local
meridian, H = 0, the corrected phase will be zero, and the right ascension $\alpha$ can be
determined from the time of this transit and Eq. (12.3).
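
The behavior of the fringe phase for an east-west baseline can be checked numerically. The sketch below evaluates Eq. (12.2) and takes the baseline direction to be d = 0 with an hour angle of −π/2 (an east-pointing baseline, which is an assumption about the sign convention) so that the geometric term reduces to the form of Eq. (12.4). It then averages the phase over a full day to recover the instrumental term. The names and numerical values are hypothetical.

```python
import math


def fringe_phase(H, delta, D_lam, d, h, phi_in=0.0):
    """Fringe phase of Eq. (12.2); all angles in radians."""
    geometric = (math.sin(d) * math.sin(delta)
                 + math.cos(d) * math.cos(delta) * math.cos(H - h))
    return 2.0 * math.pi * D_lam * geometric + phi_in


def estimate_phi_in(delta, D_lam, phi_in, n=360):
    """Average the east-west-baseline phase over a full day of hour angle.

    The geometric term averages to zero, leaving the instrumental phase.
    """
    hours = [2.0 * math.pi * k / n for k in range(n)]
    total = sum(fringe_phase(H, delta, D_lam, 0.0, -math.pi / 2, phi_in)
                for H in hours)
    return total / n


dec = math.radians(50.0)
phi_est = estimate_phi_in(dec, D_lam=1000.0, phi_in=0.3)
phase_at_transit = fringe_phase(0.0, dec, 1000.0, 0.0, -math.pi / 2, 0.3)
```

At transit the geometric term vanishes, so the phase equals the instrumental term alone; once that term is removed, the zero crossing marks the transit time.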

The length of the baseline can be determined by observing sources close to the
celestial equator, $|\delta| \approx 0$, where the dependence of phase on $\delta$ is small. With this
calibration of $D_\lambda$ and $\phi_{\rm in}$, the positions of other sources can be determined, i.e., their
right ascensions from the transit times of the central fringe and their declinations
from the diurnal amplitude of $\phi(H)$. The source declination can also be found from
the rate of change of phase at H = 0. This rate of change of phase is

$$ \left. \frac{d\phi}{dt} \right|_{H=0} = -2\pi D_\lambda \omega_e \cos\delta, \tag{12.5} $$

where $\omega_e = dH/dt$, the rotation rate of the Earth. From Eq. (12.5), it is clear that if
the error in $(d\phi/dt)_{H=0}$ is $\sigma_f$, then the error in position will be

$$ \sigma_\delta \simeq \frac{1}{2\pi D_\lambda \omega_e \sin\delta}\, \sigma_f. \tag{12.6} $$


Note that accuracy of the derived declinations is poor for sources near the celestial 
equator. An informative review of the application of these techniques is given by 
Smith (1952). 
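
The dependence of the declination error on declination in Eq. (12.6) can be illustrated numerically. The sketch assumes a value of 7.292115 × 10⁻⁵ rad/s for the rotation rate of the Earth; the baseline length and fringe-rate error are hypothetical.

```python
import math

OMEGA_E = 7.292115e-5  # assumed rotation rate of the Earth, rad/s


def declination_error(sigma_f, D_lam, delta):
    """Declination error (rad) from Eq. (12.6).

    sigma_f : error in the rate of change of phase at H = 0 (rad/s)
    D_lam   : east-west baseline length in wavelengths
    delta   : source declination (rad)
    """
    return sigma_f / (2.0 * math.pi * D_lam * OMEGA_E * abs(math.sin(delta)))


# Hypothetical case: baseline of 10^4 wavelengths, fringe-rate error 1e-6 rad/s.
err_30 = declination_error(1.0e-6, 1.0e4, math.radians(30.0))
err_5 = declination_error(1.0e-6, 1.0e4, math.radians(5.0))
```

The error grows as 1/sin δ toward the celestial equator, which is the quantitative form of the loss of declination accuracy noted above.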

In the determination of right ascension, interferometer observations provide 
relative measurements, that is, the differences in right ascension among different 
sources. The zero of right ascension is defined as the great circle through the pole 
and through the intersection of the celestial equator and the ecliptic at the vernal 
equinox at a specific epoch. The vernal equinox is the point at which the apparent 
position of the Sun moves from the Southern to the Northern celestial hemisphere. 
This direction can be located in terms of the motions of the planets, which are well- 
defined objects for optical observations. It has been related to the positions of bright 
stars that provide a reference system for optical measurements of celestial position. 
Relating the radio measurements to the zero of right ascension is less easy, since 
solar system objects are generally weak or do not contain sharp enough features in 
their radio structure. In the 1970s, results were obtained from the lunar occultation 
of the source 3C273B (Hazard et al. 1971) and from measurements of the weak 
radio emission from nearby stars such as Algol (β Persei) (Ryle and Elsmore 1973;
Elsmore and Ryle 1976). 

In the reduction of interferometer measurements in astrometry, the visibility 
data are interpreted basically in terms of the positions of point sources. The data 
processing is equivalent, in effect, to model fitting using delta-function intensity 
components, the visibility function for which has been discussed in Sect. 4.4. The 
essential position data are determined from the calibrated visibility phase or, in 
some VLBI observations, from the geometric delay as measured by maximization 
of the cross-correlation of the signals (i.e., the use of the bandwidth pattern) and 
from the fringe frequency. Because the position information is contained in the 
visibility phase, measurements of closure phase discussed in Sect. 10.3 are of use in
astrometry and geodesy only insofar as they can provide a means of correcting for
the effects of source structure. Uniformity of (u, v) coverage is less important than 
in imaging because high dynamic range is generally not needed. Determination of 
the position of an unresolved source depends on interferometry with precise phase 
calibration and a sufficient number of baselines to avoid ambiguity in the position. 


12.1.1 Reference Frames 


A reference frame based on the positions of distant extragalactic objects can be 
expected to show greater temporal stability than a frame based on stellar positions 
and to approach more closely the conditions of an inertial frame. An inertial frame 
is one that is at rest or in uniform motion with respect to absolute space and not 
in a state of acceleration or rotation [see, e.g., Mueller (1981)]. Newton’s first law 
holds in such a frame. A detailed description of astronomical reference frames is 
given by Johnston and de Vegt (1999). The International Celestial Reference System 
(ICRS) adopted by the International Astronomical Union (IAU) specifies the zero 
points and directions of the axes of the coordinate system for celestial positions. The 
measured positions of a set of reference objects in the coordinates of the reference 
system provide the International Celestial Reference Frame (ICRF). Thus, the frame 
provides the reference points with respect to which positions of other objects are 
measured within the coordinate system. 

The most accurate measurements of celestial positions are those of selected 
extragalactic sources observed by VLBI. Large databases of such high-resolution 
observations exist as a result of measurements made for purposes of geodesy 
and astrometry. These measurements have been made in a systematic way mainly 
since 1979, using VLBI systems with dual frequencies of 2.3 and 8.4 GHz to
allow calibration of ionospheric effects. The positions are determined mainly by
the 8.4-GHz data. The first catalog of source positions, now called ICRF1 (Ma
et al. 1998), was adopted by the IAU in 1998. This frame supersedes earlier ones
based on optical positions of stars, most recently those of the FK5 and Hipparcos
catalogs. The ICRF1 was based on $1.6 \times 10^6$ measurements of group delay obtained
between 1979 and 1995 of 608 sources. Criteria for exclusion of a source included 
inconsistency in the position measurements, evidence of motion, or presence of 
extended structure. In this study, 212 sources were found that passed all tests; 294 
failed in one criterion; and 102 other sources, including 3C273, failed in several. 
The 212 sources in the best category were used to define the reference frame. 
Only 27% of these are in the Southern Hemisphere. A global solution provides the 
positions of the sources together with the antenna positions and various geodetic and 
atmospheric parameters. Position errors of the 212 defining sources are mostly less 
than 0.5 mas in both right ascension and declination and less than 1 mas in almost
all cases. 

In 2009, an updated reference frame catalog called ICRF2 was released (Fey et al. 
2009, 2015) and was adopted by the IAU. It contains the positions of 3,414 sources
derived from $6.5 \times 10^6$ measurements of group delay acquired over 30 years through
1998. About 28% of the data came from the VLBA. The core reference frame is
based and maintained on data from 295 sources, whose distribution is much more
uniform over the celestial sphere than that of the sources in ICRF1. The positional accuracy is
about 40 μas, about five times better than achieved in ICRF1.

About 50% of sources in the ICRF have redshifts greater than 1.0. The use of 
such distant objects to define the reference frame provides a level of astrometric 
uncertainty at least an order of magnitude better than optical measurements of 
stars. The ultimate accuracy of this frame may depend on the structural stability 
of the radio sources involved [see, e.g., Fey and Charlot (1997), Fomalont et al. 
(2011)]. The level of uncertainty in the connection between the radio and optical 
frames is essentially the uncertainty in optical positions. Radio measurements of the 
positions of some of the nearer stars provide a comparison between the radio and 
optical frames. Lestrade (1991) and Lestrade et al. (1990, 1995) have measured 
the positions of about ten stars by VLBI, achieving accuracy in the range
0.5-1.5 mas. These results provide a link between the ICRF and the star positions in
the Hipparcos catalog. The visual magnitudes of the known optical counterparts of 
the reference frame sources are mostly within the range 15-21, and precise positions 
of objects fainter than 18th magnitude are likely to be very difficult to obtain. 

There are several methods of linking the extragalactic reference frame to 
the heliocentric reference frame. Pulsar positions can be derived from timing 
measurements and VLBI measurements (Bartel et al. 1985; Fomalont et al. 1992; 
and Madison et al. 2013). The timing analysis is inherently linked to the heliocentric 
frame. VLBI of space probes in orbit around solar system bodies can also help link 
the frames [see Jones et al. (2015)]. Radio observations of minor planets may be 
helpful (Johnston et al. 1982). 


12.2 Solution for Baseline and Source-Position Vectors 


We now discuss in a more formal way how interferometer baselines and source 
positions can be estimated simultaneously for phase, fringe rate, or group delay 
measurements. For discussion of early implementations of these techniques, see 
Elsmore and Mackay (1969), Wade (1970), and Brosche et al. (1973). An excellent 
tutorial is Fomalont (1995). 


12.2.1 Phase Measurements 


Consider an observation with a two-element tracking interferometer of arbitrary
baseline in which the source is unresolved. Let $\mathbf{D}_\lambda$ be the assumed baseline vector, in
units of the wavelength, and $(\mathbf{D}_\lambda - \Delta\mathbf{D}_\lambda)$ be the true vector. Similarly, let $\mathbf{s}$ be a unit
vector indicating the assumed position of the source, and let $(\mathbf{s} - \Delta\mathbf{s})$ indicate the
true position. Note that the convention used is $\Delta$ term = (approximate or assumed
value) − (true value). The expected fringe phase, using the assumed positions, is
$2\pi\,\mathbf{D}_\lambda \cdot \mathbf{s}$. The observed phase, measured relative to the expected phase, is a function
of the hour angle $H$ of the source given by

$$
\begin{aligned}
\Delta\phi(H) &= 2\pi\left[\mathbf{D}_\lambda \cdot \mathbf{s} - (\mathbf{D}_\lambda - \Delta\mathbf{D}_\lambda) \cdot (\mathbf{s} - \Delta\mathbf{s})\right] + \phi_{\rm in} \\
&= 2\pi\left(\Delta\mathbf{D}_\lambda \cdot \mathbf{s} + \mathbf{D}_\lambda \cdot \Delta\mathbf{s}\right) + \phi_{\rm in} .
\end{aligned}
\tag{12.7}
$$

A second-order term involving $\Delta\mathbf{D}_\lambda \cdot \Delta\mathbf{s}$ has been neglected since we assume that
fractional errors in $\mathbf{D}_\lambda$ and $\mathbf{s}$ are small.


The baseline vector can be written in terms of the coordinate system introduced 
in Sect. 4.1: 


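$$
\mathbf{D}_\lambda = \begin{bmatrix} X_\lambda \\ Y_\lambda \\ Z_\lambda \end{bmatrix} , \qquad
\Delta\mathbf{D}_\lambda = \begin{bmatrix} \Delta X_\lambda \\ \Delta Y_\lambda \\ \Delta Z_\lambda \end{bmatrix} ,
\tag{12.8}
$$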


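where $X_\lambda$, $Y_\lambda$, and $Z_\lambda$ form a right-handed coordinate system with $Z_\lambda$ parallel to the
Earth’s spin axis and $X_\lambda$ in the meridian plane of the interferometer. The source-position
vector can be specified in the $(X_\lambda, Y_\lambda, Z_\lambda)$ system in terms of the hour angle
$H$ and declination $\delta$ of the source by using Eq. (4.2):

$$
\mathbf{s} = \begin{bmatrix} s_X \\ s_Y \\ s_Z \end{bmatrix}
= \begin{bmatrix} \cos\delta \cos H \\ -\cos\delta \sin H \\ \sin\delta \end{bmatrix} .
\tag{12.9}
$$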


Taking the differential of Eq. (12.9), we can write 


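$$
\Delta\mathbf{s} \simeq \begin{bmatrix}
-\sin\delta \cos H\,\Delta\delta + \cos\delta \sin H\,\Delta\alpha \\
\sin\delta \sin H\,\Delta\delta + \cos\delta \cos H\,\Delta\alpha \\
\cos\delta\,\Delta\delta
\end{bmatrix} ,
\tag{12.10}
$$

where $\Delta\alpha$ and $\Delta\delta$ are the angular errors in right ascension and declination. Note
that $\Delta\alpha = -\Delta H$ [see Eq. (12.3)].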

Consider the case in which there exists a catalog of sources whose positions 
are considered to be known perfectly. Most connected-element arrays, e.g., ALMA, 
the VLA, the SMA, and IRAM, have far more antenna pads than antennas, so 
that the arrays can be reconfigured for various resolutions. Each time an array 
is reconfigured, the baselines must be redetermined because of the mechanical 
imprecision of positioning the antennas on the pads. With only baseline errors, i.e., 
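$\Delta\mathbf{s} = 0$, the residual phase [substitute Eqs. (12.8) and (12.9) into (12.7)] is

$$
\Delta\phi(H) = \phi_0 + \phi_1 \cos H + \phi_2 \sin H ,
\tag{12.11}
$$

where

$$
\begin{aligned}
\phi_0 &= 2\pi \sin\delta\,\Delta Z_\lambda + \phi_{\rm in} , \\
\phi_1 &= 2\pi \cos\delta\,\Delta X_\lambda , \\
\phi_2 &= -2\pi \cos\delta\,\Delta Y_\lambda .
\end{aligned}
\tag{12.12}
$$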


A long track on a single source can be fitted to a sinusoidal function in $H$ with
three free parameters, $\phi_0$, $\phi_1$, and $\phi_2$. $\Delta X_\lambda$ and $\Delta Y_\lambda$ can be found from $\phi_1$ and $\phi_2$,
respectively. To separate the instrumental term from $\Delta Z_\lambda$, it is necessary to observe
several sources. A simple graphical analysis would be to plot $\phi_0$ vs. $\sin\delta$ for these
sources; $\Delta Z_\lambda$ is given by the slope, $d\phi_0/d(\sin\delta) = 2\pi\,\Delta Z_\lambda$, and $\phi_{\rm in}$ by the $\sin\delta = 0$
intercept.
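The two-step procedure just described, a sinusoidal fit to each track followed by a straight-line fit of $\phi_0$ against $\sin\delta$, can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the baseline errors, the instrumental phase, the declinations, the hour-angle sampling, and the helper names (`solve3`, `fit_track`, `line_fit`) are hypothetical, the simulated phases are noise-free, and the residual phases are assumed small enough that no $2\pi$ ambiguities arise.

```python
import math

TWO_PI = 2.0 * math.pi

def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x

def fit_track(hour_angles, phases):
    """Least-squares fit of phi0 + phi1*cos(H) + phi2*sin(H), as in Eq. (12.11)."""
    basis = [(1.0, math.cos(h), math.sin(h)) for h in hour_angles]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * p for row, p in zip(basis, phases)) for i in range(3)]
    return solve3(ata, atb)

def line_fit(x, y):
    """Slope and intercept of an ordinary least-squares straight line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical baseline errors (wavelengths) and instrumental phase (radians).
TRUE_DX, TRUE_DY, TRUE_DZ, TRUE_PHI_IN = 0.03, -0.02, 0.05, 0.4
declinations = [math.radians(d) for d in (-20.0, 10.0, 35.0, 60.0)]
hour_angles = [math.radians(15.0 * h) for h in range(-5, 6)]  # -5 h to +5 h

def residual_phase(dec, h):
    """Noise-free residual phase from Eqs. (12.11) and (12.12)."""
    phi0 = TWO_PI * math.sin(dec) * TRUE_DZ + TRUE_PHI_IN
    phi1 = TWO_PI * math.cos(dec) * TRUE_DX
    phi2 = -TWO_PI * math.cos(dec) * TRUE_DY
    return phi0 + phi1 * math.cos(h) + phi2 * math.sin(h)

phi0_values, dx_values, dy_values = [], [], []
for dec in declinations:
    track = [residual_phase(dec, h) for h in hour_angles]
    phi0, phi1, phi2 = fit_track(hour_angles, track)
    phi0_values.append(phi0)
    dx_values.append(phi1 / (TWO_PI * math.cos(dec)))   # from phi1
    dy_values.append(-phi2 / (TWO_PI * math.cos(dec)))  # from phi2

# Slope of phi0 versus sin(dec) is 2*pi*dZ; the intercept is phi_in.
slope, intercept = line_fit([math.sin(d) for d in declinations], phi0_values)
dz_estimate = slope / TWO_PI
phi_in_estimate = intercept
dx_estimate = sum(dx_values) / len(dx_values)
dy_estimate = sum(dy_values) / len(dy_values)
print(dx_estimate, dy_estimate, dz_estimate, phi_in_estimate)
```

With real data the same fit would be weighted by the measurement noise, and a single source cannot separate $\Delta Z_\lambda$ from $\phi_{\rm in}$, which is why several declinations are used here.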

In the general case, as encountered in geodetic applications of VLBI, it is 
necessary to determine both baseline and source positions. Here, the residual phase 
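[substitute Eqs. (12.8)-(12.10) into (12.7)] is the same as Eq. (12.11) but with

$$
\begin{aligned}
\phi_0 &= 2\pi\left(\sin\delta\,\Delta Z_\lambda + Z_\lambda \cos\delta\,\Delta\delta\right) + \phi_{\rm in} , \\
\phi_1 &= 2\pi\left(\cos\delta\,\Delta X_\lambda + Y_\lambda \cos\delta\,\Delta\alpha - X_\lambda \sin\delta\,\Delta\delta\right) , \\
\phi_2 &= 2\pi\left(-\cos\delta\,\Delta Y_\lambda + X_\lambda \cos\delta\,\Delta\alpha + Y_\lambda \sin\delta\,\Delta\delta\right) .
\end{aligned}
\tag{12.13}
$$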


Interleaved observations need to be made of a set of sources over a period of 
$\sim 12$ h. Three parameters ($\phi_0$, $\phi_1$, and $\phi_2$) can be derived for each source. If
$n_s$ sources are observed, $3n_s$ quantities are obtained. The number of unknown
parameters required to specify the $n_s$ positions, the baseline, and the instrumental
phase (assumed to be constant) is $2n_s + 3$; the right ascension of one source is
chosen arbitrarily. Thus, if $n_s \geq 3$, it is possible to solve for all the unknown
quantities. Note that the sources should have as wide a range in declination as
possible in order to distinguish $\Delta Z_\lambda$ from $\phi_{\rm in}$ in Eq. (12.12). Least-mean-squares
analysis provides simultaneous solutions for the instrumental parameters and the 
source positions. Usually, many more than three sources are observed, so there is 
redundant information, and variation of the instrumental phase with time as well 
as other parameters can be included in the solution. A discussion of the method of 
least-mean-squares analysis can be found in Appendix 12.1. 

Most astronomers are concerned with measuring the position of a source of 
interest with respect to a nearby calibrator taken from the ICRF or other catalog 
on an interferometer with well-calibrated baselines. In this case, the phase terms for 
Eq. (12.11) are 


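$$
\begin{aligned}
\phi_0 &= 2\pi Z_\lambda \cos\delta\,\Delta\delta + \phi_{\rm in} , \\
\phi_1 &= 2\pi\left(Y_\lambda \cos\delta\,\Delta\alpha - X_\lambda \sin\delta\,\Delta\delta\right) , \\
\phi_2 &= 2\pi\left(X_\lambda \cos\delta\,\Delta\alpha + Y_\lambda \sin\delta\,\Delta\delta\right) .
\end{aligned}
\tag{12.14}
$$

However, the fringe visibility of a point source at position $l = \Delta\alpha \cos\delta$ and $m =
\Delta\delta$ is

$$
\mathcal{V} = \mathcal{V}_0\, e^{-j2\pi(ul + vm)} .
\tag{12.15}
$$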


Thus, the source can be imaged by the usual interferometric techniques and its 
position determined by fitting a Gaussian (or similar) profile to the image plane. 
The accuracy of position determined in this manner will be limited by the thermal 
noise to approximately 


$$
\sigma_\theta \simeq \frac{1}{2}\,\frac{\theta_{\rm res}}{R_{\rm sn}} ,
\tag{12.16}
$$

where $\theta_{\rm res}$ is the resolution of the interferometer, and $R_{\rm sn}$ is the signal-to-noise
ratio (SNR) [Reid et al. 1988, Condon 1997; see also Eq. (10.68)]. It is shown in
Appendix A12.1.3 that the Fourier transform used in imaging is equivalent to a grid
parameter search with trial values of $\alpha$ and $\delta$. However, to find baseline parameters
or to analyze complex data sets, it is necessary to perform the data analysis in the 
(u, v) plane. 


12.2.2 Measurements with VLBI Systems 


The use of independent local oscillators at the antennas in VLBI systems does not 
easily permit calibration of absolute fringe phase. The earliest method used for 
obtaining positional information in VLBI was the analysis of the fringe frequency 
(fringe rate). The fringe frequency is the time rate of change of the interferometer 
phase. Thus, from Eq. (12.2), the fringe frequency is 


$$
\nu_f = \frac{1}{2\pi}\frac{d\phi}{dt} = -\omega_e D_\lambda \cos d \cos\delta \sin(H - h) + \nu_{\rm in} ,
\tag{12.17}
$$

where $\omega_e$ is the angular velocity of rotation of the Earth ($dH/dt$), and $\nu_{\rm in}$ is an
instrumental term equal to $(1/2\pi)\,d\phi_{\rm in}/dt$. The component $\nu_{\rm in}$ largely results from residual
differences in the frequencies of the hydrogen masers, which provide the local
oscillator references at the antennas.

The quantity $D_\lambda \cos d$ is the projection of the baseline in the equatorial plane,
denoted $D_E$. Thus, Eq. (12.17) can be rewritten

$$
\nu_f = -\omega_e D_E \cos\delta \sin(H - h) + \nu_{\rm in} .
\tag{12.18}
$$

The polar component of the baseline (the projection of the baseline along the polar
axis) does not appear in the equation for fringe frequency. An interferometer with
a baseline parallel to the spin axis of the Earth has lines of constant phase parallel
to the celestial equator, and the interferometer phase does not change with hour
angle. Therefore, the polar component of the baseline cannot be determined from
the analysis of fringe frequency.

The usual practice in VLBI is to refer hour angles to the Greenwich meridian. 
We follow this convention and use a right-handed coordinate system with X through 
the Greenwich meridian and with Z toward the north celestial pole. Thus, in terms 
of the Cartesian coordinates for the baseline, Eq. (12.18) becomes 


$$
\nu_f = -\omega_e \cos\delta\,(X_\lambda \sin H + Y_\lambda \cos H) + \nu_{\rm in} .
\tag{12.19}
$$


The residual fringe frequency $\Delta\nu_f$, that is, the difference between the observed
and expected fringe frequencies, can be calculated by taking the differentials of
Eq. (12.19) with respect to $\delta$, $H$, $X_\lambda$, and $Y_\lambda$ and also including the unknown quantity
$\nu_{\rm in}$. We thereby obtain

$$
\Delta\nu_f = a_1 \cos H + a_2 \sin H + \nu_{\rm in} ,
\tag{12.20}
$$

where

$$
a_1 = \omega_e\left(Y_\lambda \sin\delta\,\Delta\delta + X_\lambda \cos\delta\,\Delta\alpha - \cos\delta\,\Delta Y_\lambda\right)
\tag{12.21}
$$

and

$$
a_2 = \omega_e\left(X_\lambda \sin\delta\,\Delta\delta - Y_\lambda \cos\delta\,\Delta\alpha - \cos\delta\,\Delta X_\lambda\right) .
\tag{12.22}
$$

Note that $\Delta\nu_f$ is a diurnal sinusoid and that the average value of $\Delta\nu_f$ is the
instrumental term $\nu_{\rm in}$. Information about source positions and baselines must come
from the two parameters $a_1$ and $a_2$. Therefore, unlike the case of fringe phase
[Eq. (12.11)] where three parameters per source are available, it is not possible
to solve for both source and baseline parameters with fringe-frequency data. For
example, from observations of $n_s$ sources, $2n_s + 1$ quantities are obtained. The total
number of unknowns (two baseline parameters, $2n_s$ source parameters, and $\nu_{\rm in}$) is
$2n_s + 3$. If the position of one source is known, the rest of the source positions and
$X_\lambda$, $Y_\lambda$, and $\nu_{\rm in}$ can be determined. Note that the accuracy of the measurements of
source declinations is reduced for sources close to the celestial equator because of
the $\sin\delta$ factor in Eqs. (12.21) and (12.22).

As an illustration of the order of magnitude of the parameters involved in fringe-frequency
observations, consider two antennas with an equatorial component of
spacing equal to 1000 km and an observing wavelength of 3 cm. Then $D_E \simeq 3 \times 10^{7}$
wavelengths, and the fringe frequency for a low-declination source is about 2 kHz.
Assume that the coherence time of the independent frequency standards is about
10 min. In this period, $10^{6}$ fringe cycles can be counted. If we suppose that the phase
can be measured to 0.1 turn, $\nu_f$ will be obtained to a precision of 1 part in $10^{7}$. The
corresponding errors in $D_E$ and angular position are 10 cm and 0.02″, respectively.

To overcome the limitations of fringe-frequency analysis, techniques for the
precise measurement of the relative group delay of the signals at the antennas
were developed. The use of bandwidth synthesis to improve the accuracy of delay
measurements has been discussed in Sect. 9.8. The group delay is equal to the
geometric delay $\tau_g$ except that, as measured, it also includes unwanted components
resulting from clock offsets at the antennas and atmospheric differences in the signal
paths. The fringe phase measured with a connected-element interferometer observing
at frequency $\nu$ is $2\pi\nu\tau_g$, modulo $2\pi$. Except for the dispersive ionosphere, the
group delay therefore contains the same type of information as the fringe phase,
without the ambiguity resulting from the modulo $2\pi$ restriction. Thus, group delay
measurements permit a solution for baselines and source positions similar to that
discussed above for connected-element systems, except that clock offset terms also
must be included.
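Returning to the order-of-magnitude illustration of fringe-frequency observations given above, the quoted figures can be reproduced with a short calculation. This is a rough sketch rather than a prescribed procedure: the variable names are hypothetical, the Earth rotation rate is the standard sidereal value, and the maximum fringe frequency is taken with $\cos\delta \sin(H - h) = 1$ in Eq. (12.18). The unrounded results differ slightly from the round numbers in the text.

```python
import math

OMEGA_E = 7.292e-5               # Earth's angular rotation rate, rad/s

equatorial_spacing_m = 1.0e6     # 1000-km equatorial baseline component
wavelength_m = 0.03              # 3-cm observing wavelength
coherence_time_s = 600.0         # 10-min coherence time
phase_precision_turns = 0.1      # phase measured to 0.1 turn

# Equatorial baseline component in wavelengths (about 3e7).
d_e_wavelengths = equatorial_spacing_m / wavelength_m

# Largest fringe frequency from Eq. (12.18), instrumental term ignored.
max_fringe_frequency_hz = OMEGA_E * d_e_wavelengths

# Fringe cycles counted within one coherence time.
fringe_cycles = max_fringe_frequency_hz * coherence_time_s

# Fractional precision of the fringe frequency.
fractional_precision = phase_precision_turns / fringe_cycles

# Corresponding errors in baseline and in angular position.
baseline_error_m = fractional_precision * equatorial_spacing_m
angle_error_arcsec = math.degrees(fractional_precision) * 3600.0

print(f"D_E              = {d_e_wavelengths:.2e} wavelengths")
print(f"fringe frequency = {max_fringe_frequency_hz:.0f} Hz")
print(f"fringe cycles    = {fringe_cycles:.2e}")
print(f"baseline error   = {100.0 * baseline_error_m:.1f} cm")
print(f"angle error      = {angle_error_arcsec:.3f} arcsec")
```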

It is interesting to compare the relative accuracies of group delay and the fringe 
frequency (or, equivalently, the rate of change of phase delay) measurements. The 
intrinsic precision with which each of these quantities can be measured is derived in 
Appendix 12.1 [Eqs. (A12.27) and (A12.34)] and can be written 


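$$
\sigma_f = \sqrt{\frac{3}{2\pi^2}} \left(\frac{T_S}{T_A}\right) \frac{1}{\sqrt{\Delta\nu\,\tau^3}}
\tag{12.23}
$$

and

$$
\sigma_\tau = \frac{1}{\sqrt{8\pi^2}} \left(\frac{T_S}{T_A}\right) \frac{1}{\sqrt{\Delta\nu\,\tau}\;\Delta\nu_{\rm rms}} ,
\tag{12.24}
$$

where $\sigma_f$ and $\sigma_\tau$ are the rms errors in fringe frequency and delay, $T_S$ and $T_A$ are the
system and antenna temperatures, $\Delta\nu$ is the IF bandwidth, $\tau$ is the integration time,
and $\Delta\nu_{\rm rms}$ is the rms bandwidth introduced in Sect. 9.8 [see also Eq. (A12.32) and
related text in Appendix 12.1]. $\Delta\nu_{\rm rms}$ is typically 40% of the spanned bandwidth.
For a single rectangular RF band, $\Delta\nu_{\rm rms} = \Delta\nu/\sqrt{12}$. To express the measurement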
error as an angle, recall that the geometric delay is 


$$
\tau_g = \frac{D}{c} \cos\theta ,
\tag{12.25}
$$

where $\theta$ is the angle between the source vector and the baseline vector. Thus, the
sensitivity of the delay to angular changes is

$$
\frac{\Delta\tau_g}{\Delta\theta_\tau} = \frac{D}{c} \sin\theta ,
\tag{12.26}
$$

where $\Delta\theta_\tau$ is the increment in $\theta$ corresponding to an increment $\Delta\tau_g$ in $\tau_g$. Similarly,
the sensitivity of the fringe frequency to angular changes [since $\nu_f = \nu\,(d\tau_g/dt)$] is
(for an east-west baseline)


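$$
\frac{\Delta\nu_f}{\Delta\theta_f} = D_\lambda\,\omega_e \sin\theta ,
\tag{12.27}
$$

where $\Delta\theta_f$ is the increment in $\theta$ corresponding to an increment $\Delta\nu_f$ in $\nu_f$. Thus,
by setting $\Delta\nu_f = \sigma_f$ and $\Delta\tau_g = \sigma_\tau$ and ignoring geometric factors, we obtain the
equation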


$$
\frac{\Delta\theta_\tau}{\Delta\theta_f} \simeq 2\pi\,\frac{\tau/t_e}{\Delta\nu/\nu} ,
\tag{12.28}
$$

where $t_e = 2\pi/\omega_e$ is the period of the Earth’s rotation. Equation (12.28) describes
the relative precision of delay and fringe-frequency measurements. In practice, 
measurements of delay are generally more accurate because of the noise imposed 
by the atmosphere. Measurements of fringe frequency are sensitive to the time 
derivative of atmospheric path length, and in a turbulent atmosphere, this derivative 
can be large, while the average path length is relatively constant. Note that fringe- 
frequency and delay measurements are complementary. For example, with a VLBI 
system of known baseline and instrumental parameters, the position of a source 
can be found from a single observation using the delay and fringe frequency 
because these quantities constrain the source position in approximately orthogonal 
directions. The earliest analyses of fringe-frequency and delay measurements to 
determine source positions and baselines were made by Cohen and Shaffer (1971) 
and Hinteregger et al. (1972). 
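The comparison of intrinsic precisions can be checked numerically. The sketch below evaluates the rms errors of Eqs. (12.23) and (12.24), converts each to an angular error with the sensitivities of Eqs. (12.26) and (12.27) (geometric factors ignored), and compares the ratio with the closed form $2\pi(\tau/t_e)/(\Delta\nu/\nu)$. The observing parameters ($T_S/T_A = 100$, a 32-MHz band, 100-s integration, 8.4-GHz observing frequency) and the function names are assumptions chosen only for illustration, and a single rectangular band is assumed so that $\Delta\nu_{\rm rms} = \Delta\nu/\sqrt{12}$.

```python
import math

T_E = 86164.1                       # sidereal rotation period of the Earth, s
OMEGA_E = 2.0 * math.pi / T_E       # rad/s

def sigma_fringe_frequency(ts_over_ta, bandwidth_hz, tau_s):
    """Rms fringe-frequency error in Hz, following Eq. (12.23)."""
    return (math.sqrt(3.0 / (2.0 * math.pi ** 2)) * ts_over_ta
            / math.sqrt(bandwidth_hz * tau_s ** 3))

def sigma_delay(ts_over_ta, bandwidth_hz, tau_s, rms_bandwidth_hz):
    """Rms delay error in seconds, following Eq. (12.24)."""
    return ts_over_ta / (math.sqrt(8.0 * math.pi ** 2)
                         * math.sqrt(bandwidth_hz * tau_s) * rms_bandwidth_hz)

# Hypothetical observing parameters.
ts_over_ta = 100.0
bandwidth_hz = 32.0e6
tau_s = 100.0
nu_hz = 8.4e9
rms_bandwidth_hz = bandwidth_hz / math.sqrt(12.0)   # single rectangular band

sigma_f = sigma_fringe_frequency(ts_over_ta, bandwidth_hz, tau_s)
sigma_tau = sigma_delay(ts_over_ta, bandwidth_hz, tau_s, rms_bandwidth_hz)

# Angular errors: d_theta_tau = c*sigma_tau/D and d_theta_f = sigma_f/(D_lambda*omega_e),
# with D_lambda = D*nu/c, so the baseline length cancels in the ratio.
ratio_from_sigmas = nu_hz * OMEGA_E * sigma_tau / sigma_f
ratio_closed_form = 2.0 * math.pi * (tau_s / T_E) / (bandwidth_hz / nu_hz)

print(f"sigma_f   = {sigma_f:.3e} Hz")
print(f"sigma_tau = {sigma_tau:.3e} s")
print(f"ratio     = {ratio_from_sigmas:.3f} (closed form {ratio_closed_form:.3f})")
```

For these assumed values the two techniques have comparable intrinsic precision; bandwidth synthesis, which raises $\Delta\nu_{\rm rms}$, shifts the balance toward delay.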

The accuracy with which group delay can be used to measure a source position 
is proportional to the reciprocal of the bandwidth 1/Av. Similarly, the accuracy 
with which phase can be used to measure a source position is proportional to the 
reciprocal of the observing frequency 1/v. Since the proportionality constants are 
approximately the same, the relative accuracy of these techniques is v/Av. This 
ratio of the observing frequency to the bandwidth, including effects of bandwidth 
synthesis, is commonly one to two orders of magnitude. On the other hand, the 
antenna spacings used in VLBI are one to two orders of magnitude greater than 
those used in connected-element systems. Thus, the accuracy of source positions 
estimated from group delay measurements with VLBI systems is comparable to the 
accuracy of those estimated from fringe phase measurements on connected-element 
systems having much shorter baselines. VLBI position measurements using phase 
referencing, as described below, are the most accurate of radio methods. 

The ultimate limitations on ground-based interferometry are imposed by the 
atmosphere. Dual-frequency-band measurements effectively remove ionospheric 
phase noise (see Sect. 14.1.3). The rms phase noise of the troposphere increases 
about as $d^{5/6}$, where $d$ is the projected baseline length, for baselines shorter than
a few kilometers [see Eq. (13.101) and Table 13.3]. In this regime, measurement
accuracies of angles improve only slowly with increasing baseline length. For
baselines greater than $\sim 100$ km, the effects of the troposphere above the
interferometer elements are uncorrelated, and the measurement accuracy might be 
expected to improve more rapidly with baseline length. However, for widely spaced 
elements, the zenith angle can be significantly different, and the atmospheric model 
becomes very important. 


12.2.3 Phase Referencing (Position) 


In VLBI measurements of relative positions of closely spaced sources, it is 
possible to measure the relative fringe phases and thus obtain positional accuracy 
corresponding to the very high angular resolution inherent in the long baselines. The 
most accurate measurements can be made when the sources are sufficiently close 
that both fall within the antenna beams [see, e.g., Marcaide and Shapiro (1983) 
and Rioja et al. (1997)] or when they are no more than a few degrees apart so that 
tropospheric and ionospheric effects are closely matched (Shapiro et al. 1979; Bartel 
et al. 1984; Ros et al. 1999). In such cases, one source can be used as a calibrator in 
a manner similar to that for phase calibration in connected-element arrays. In VLBI, 
this procedure is referred to as phase referencing. It allows imaging of sources 
for which the flux densities are too low to permit satisfactory self-calibration. The 
description here follows reviews of phase referencing procedures by Alef (1989) 
and Beasley and Conway (1995). 

In phase referencing observations, measurements are made alternately on the 
target source and on a nearby calibrator, with periods on the order of a minute on 
each. (Note that the calibrator is also referred to as the phase reference source.) The 
rate of change of phase during these measurements must be slow enough that, from 
one calibrator measurement to the next, it is possible to interpolate the phase without 
ambiguity factors of $2\pi$. It is therefore necessary to use careful modeling to remove
geodetic and atmospheric effects, including tectonic plate motions, polar motion, 
Earth tides, and ocean loading, and to make precise corrections for precession and 
nutation on the source positions. More subtle effects may need to be taken into 
account; for example, gravitational distortions of antenna structures, which tend 
to cancel out in connected-element arrays, can affect VLBI baselines because of 
the difference in elevation angles at widely spaced locations. Phase referencing 
has become more useful as better models for these effects, together with increased 
sensitivity and phase stability of receiving systems, have been developed. 

Consider the case in which we observe the calibration source at time $t_1$, then the
target source at time $t_2$, and then the calibrator again at time $t_3$. For any one of these
observations, the measured phase is

$$
\phi_{\rm meas} = \phi_{\rm vis} + \phi_{\rm inst} + \phi_{\rm pos} + \phi_{\rm ant} + \phi_{\rm atmos} + \phi_{\rm ionos} ,
\tag{12.29}
$$

where the terms on the right side are, respectively, the components of the phase
due to the source visibility, instrumental effects (cables, clock errors, etc.), the error
in the assumed source position, errors in assumed antenna positions, the effect of
the neutral atmosphere, and the effect of the ionosphere. To correct the phase of
the target source, we need to interpolate the measurements on the calibrator at $t_1$
and $t_3$ to estimate what the calibrator phase would have been if measured at $t_2$
and then subtract the interpolated phase from the measured phase for the target
source. If the positions of the target source and the calibrator are sufficiently close 
on the sky (not more than a few degrees apart), lines of sight from any antenna 
to the two sources will pass through the same isoplanatic patch, so the differences 
in the atmospheric and ionospheric terms can be neglected. We can assume that 
the instrumental terms do not differ significantly with small position changes, and 
if the calibrator is unresolved, then its visibility phase is zero. If the calibrator is 
partially resolved, it should be strong enough to allow imaging by self-calibration, 
and correction can be made for its phase. Thus, the corrected phase of the target 
source reduces to 


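$$
\phi^{t} - \tilde{\phi}^{c} = \phi^{t}_{\rm vis} + \left(\phi^{t}_{\rm pos} - \tilde{\phi}^{c}_{\rm pos}\right) ,
\tag{12.30}
$$

where the superscripts $t$ and $c$ refer to the target source and calibrator, respectively,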
and the tilde indicates interpolated values. The right side of Eq. (12.30) depends only 
on the structure and position of the target source, and the position of the calibrator. 
Figure 12.1 shows an example of phase referencing in which fringe fitting was 
performed on the data for the reference source, that is, determination of baseline 
errors, offsets between time standards at the sites, and instrumental phases. The 
results for the phase reference source (calibrator) are shown as crosses, and the 
resulting phase and phase rate corrections were interpolated to the times of the data 
points for the target source, shown as open squares. The corrected phases for the 
target source are shown in the lower diagram. For fringe fitting, it is desirable to 
have a source that is unresolved and provides a strong signal, so a phase reference 
source should be chosen for these characteristics when the target source is weak or 
resolved. 
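The interpolation and subtraction described above can be sketched in a few lines of Python. This is a simplified illustration, not the fringe-fitting procedure used for Fig. 12.1: the function names are hypothetical, the calibrator is assumed to be an unresolved point source, the interpolation is linear, and the common phase drift (0.02 rad s$^{-1}$), the scan times, and the 0.5-rad target term are invented so that the expected answer is known.

```python
import math

def wrap(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi

def unwrap(phases):
    """Remove 2*pi jumps, assuming successive samples differ by less than pi."""
    out = [phases[0]]
    for p in phases[1:]:
        out.append(out[-1] + wrap(p - out[-1]))
    return out

def interpolate(times, values, t):
    """Linear interpolation of values(times) at time t (times ascending)."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            return values[i] + (values[i + 1] - values[i]) * (t - t0) / (t1 - t0)
    raise ValueError("target time lies outside the calibrator scans")

def phase_reference(cal_times, cal_phases, target_times, target_phases):
    """Subtract the interpolated calibrator phase from each target phase."""
    cal_unwrapped = unwrap(cal_phases)
    return [wrap(p - interpolate(cal_times, cal_unwrapped, t))
            for t, p in zip(target_times, target_phases)]

def common_phase(t):
    """Hypothetical instrumental plus atmospheric phase shared by both sources."""
    return 0.3 + 0.02 * t

TARGET_TERM = 0.5   # hypothetical structure/position phase of the target, rad

cal_times = [0.0, 120.0, 240.0, 360.0]      # calibrator scans, s
target_times = [60.0, 180.0, 300.0]         # interleaved target scans, s
cal_phases = [wrap(common_phase(t)) for t in cal_times]
target_phases = [wrap(common_phase(t) + TARGET_TERM) for t in target_times]

corrected = phase_reference(cal_times, cal_phases, target_times, target_phases)
print(corrected)
```

The `unwrap` step succeeds only because the phase changes by less than half a turn between calibrator scans, which is the condition that limits the switching cycle time.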

Fig. 12.1 An example of phase referencing with the VLBA. The data are from the Brewster–Pie
Town baseline with an observing frequency of 8.4 GHz. The top figure shows the uncalibrated data
for two sources: 1638+398 (the target source, open squares) and 1641+399 (the phase reference
source, crosses). The bottom figure shows the data for 1641+399 after fringe fitting, and the data
for 1638+398 after phase referencing, using 1641+399 as the reference source. From Beasley and
Conway (1995), courtesy of and © the Astronomical Society of the Pacific.

Of the various effects in Eq. (12.29) that are removed by phase referencing, those
that vary most rapidly with time are the atmospheric ones, and at frequencies above a
few gigahertz, they result primarily from the troposphere rather than the ionosphere.
Thus, at centimeter wavelengths, the tropospheric variations limit the time that
can be allowed for each cycle of observation of the target and calibrator sources.
Variations resulting from a moving-screen model of the troposphere are described in
Section 13.1.6; the characteristics of the screen are based on Kolmogorov turbulence
theory (Tatarski 1961). The relative rms variation in phase for the target and
calibrator sources, the rays from which pass through the atmosphere a distance $d_{tc}$
apart, is proportional to $d_{tc}^{5/6}$:

$$
\sigma_\phi = \sigma_0\, d_{tc}^{5/6} ,
\tag{12.31}
$$

where $\sigma_0$ is the phase variation for 1-km ray spacing. In order to be able to
interpolate the VLBI phase reference values from one calibrator observation to
the next without ambiguity in the number of turns, the rms path length should
not change by more than $\sim \lambda/8$ between successive calibrator scans, that is, the rms
phase should not change by more than $\pi/4$. Then if the
scattering screen moves horizontally with velocity $v_s$, the criterion above results in
a limit on the time for one cycle of the target source and calibrator, $t_{\rm cyc}$. To determine
this limit, we put $d_{tc} = v_s t_{\rm cyc}$, and from Eq. (12.31) obtain

$$
t_{\rm cyc} < \left(\frac{\pi}{4\sigma_0}\right)^{6/5} v_s^{-1} ,
\tag{12.32}
$$

with $\sigma_0$ expressed in radians.


This result can be used to illustrate the time limit on the switching cycle. The
empirical data in Table 13.4 show that at $\lambda = 6$ cm (5 GHz frequency), the typical
rms delay path is about 1 mm for $d = 1$ km, at the VLA site. The corresponding
value of $\sigma_0$ for 6-cm wavelength is 6°, and for $v_s = 0.01$ km s$^{-1}$, $t_{\rm cyc} < 19$ min. This
result is for typical conditions at the VLA site. For the same location and 1-km ray
spacing, but under conditions described as “very turbulent,” Sramek (1990) gives a
value of 7.5 mm for the rms path deviation. The value of $\sigma_0$ for 6-cm wavelength
is then 45°, resulting in $t_{\rm cyc} < 1.7$ min. The elevation angle of the source was
not less than 60° for this last observation, so even shorter switching times could
apply at lower elevation angles. Specific recommendations for cycle times in VLBI
applications are given by Ulvestad (1999).
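The two cycle-time limits quoted above follow directly from the $\lambda/8$ criterion. The sketch below assumes, as in the text, that the rms phase grows as $\sigma_0 d^{5/6}$ with $d$ in kilometers, that the screen moves at 0.01 km s$^{-1}$, and that a $\lambda/8$ path change corresponds to a phase change of $\pi/4$; the function names are hypothetical.

```python
import math

def sigma0_degrees(rms_path_mm, wavelength_mm):
    """Rms phase (degrees) for 1-km ray spacing, from the rms path deviation."""
    return 360.0 * rms_path_mm / wavelength_mm

def max_cycle_time_s(sigma0_deg, screen_speed_km_s, max_phase_rad=math.pi / 4.0):
    """Largest t_cyc for which sigma0 * (v_s * t_cyc)**(5/6) stays below max_phase_rad."""
    sigma0_rad = math.radians(sigma0_deg)
    max_separation_km = (max_phase_rad / sigma0_rad) ** (6.0 / 5.0)
    return max_separation_km / screen_speed_km_s

WAVELENGTH_MM = 60.0         # 6-cm observing wavelength
SCREEN_SPEED_KM_S = 0.01     # screen velocity used in the text

typical_sigma0 = sigma0_degrees(1.0, WAVELENGTH_MM)      # 1 mm rms path
turbulent_sigma0 = sigma0_degrees(7.5, WAVELENGTH_MM)    # 7.5 mm rms path

typical_limit_min = max_cycle_time_s(typical_sigma0, SCREEN_SPEED_KM_S) / 60.0
turbulent_limit_min = max_cycle_time_s(turbulent_sigma0, SCREEN_SPEED_KM_S) / 60.0

print(f"typical:   sigma0 = {typical_sigma0:.0f} deg, t_cyc < {typical_limit_min:.1f} min")
print(f"turbulent: sigma0 = {turbulent_sigma0:.0f} deg, t_cyc < {turbulent_limit_min:.1f} min")
```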

At frequencies below ~ 1 GHz, the ionosphere becomes the limiting factor and 
medium-scale traveling ionospheric disturbances (MSTIDs), which have velocities 
of 100-300 m s$^{-1}$ and wavelengths up to several hundred kilometers, become important
(Hocke and Schlegel 1996). Phase fluctuations resulting from the ionosphere
or troposphere are minimized in the approximate range 5-15 GHz, in which good 
performance can be obtained by phase referencing in VLBI. 

There are also limits on the angular range that should be used in switching to the
phase reference source, since even with a static atmosphere, phase errors are introduced
that increase with switching angle. Phase referencing over 3° with 50-μas
precision has been demonstrated by Reid et al. (2009) and Reid and Honma (2014).

The offsets and uncertainties in the geometric parameters of the interferometer 
cause residual errors that scale in proportion to the angular separation between the 
target source and the reference or calibrator source. To first order, an offset in the 
position of the calibrator source is simply transferred to the estimate of the position 
of the target source. This is because the (u, v) coordinates of the target and calibrator 
sources can be considered to be the same for a separation of a few degrees or less. 
However, second-order corrections can be important. The total fringe phase [see
Eq. (12.1)] is $2\pi D_\lambda \cos\theta$. For a target source in the direction $\theta_t$ and a calibrator
source in direction $\theta_c$, the difference in these interferometer phases will be

$$
\Delta\phi = \phi_t - \phi_c = 2\pi D_\lambda \left(\cos\theta_t - \cos\theta_c\right) .
\tag{12.33}
$$

Since $\cos\theta_c \simeq \cos\theta_t - \sin\theta_t\,(\theta_c - \theta_t)$,

$$
\Delta\phi \simeq 2\pi D_\lambda \sin\theta_t\,\theta_{\rm sep} ,
\tag{12.34}
$$

where $\theta_{\rm sep} = \theta_c - \theta_t$.

Now we include the effect of a change in phase for an error in the baseline of
$\Delta D_\lambda$, giving a second-order phase term of

$$
\Delta^2\phi \simeq 2\pi\,\Delta D_\lambda \sin\theta_t\,\theta_{\rm sep} ,
\tag{12.35}
$$

or, ignoring the geometric factor,

$$
\Delta^2\phi \simeq 2\pi\,\Delta D_\lambda\,\theta_{\rm sep} .
\tag{12.36}
$$

This is the phase that affects astrometric accuracy. Hence, the effect of the baseline
error on the phase is reduced by the factor $\theta_{\rm sep}$. The phase shift has an equivalent
position offset of

$$
\Delta\theta = \frac{\Delta^2\phi}{2\pi D_\lambda} = \frac{\Delta D}{D}\,\theta_{\rm sep} .
\tag{12.37}
$$


With $D = 8000$ km, $\lambda = 4$ cm, and $\theta_{\rm sep} = 1$ degree (0.017 radians), $D_\lambda = 2 \times 10^{8}$, and
the resolution is $\theta_{\rm res} = 1/D_\lambda \simeq 1$ mas. An error in the baseline of 2 cm would cause
a phase error of about 3 degrees, corresponding to an angle of 9 μas.

Equation (12.37) provides an excellent rule of thumb for astrometric accuracy,
and $\Delta D/D$ can be interpreted as a rotational angular error in the baseline, or $\Delta D$
can be replaced by $c\,\Delta\tau$, where $\Delta\tau$ might characterize the atmospheric delay error.
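The numerical example for the rule of thumb can be verified directly. The sketch below uses only the values stated in the text (an 8000-km baseline, 4-cm wavelength, 1-degree separation, and a 2-cm baseline error) and drops the geometric factor, as in Eqs. (12.36) and (12.37); the variable names are hypothetical.

```python
import math

RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0   # milliarcseconds per radian

baseline_m = 8.0e6                   # D = 8000 km
wavelength_m = 0.04                  # 4-cm wavelength
separation_rad = math.radians(1.0)   # target-calibrator separation
baseline_error_m = 0.02              # 2-cm baseline error

# Baseline in wavelengths and the corresponding fringe spacing.
d_lambda = baseline_m / wavelength_m
resolution_mas = RAD_TO_MAS / d_lambda

# Second-order phase error, 2*pi*dD_lambda*theta_sep.
delta_d_lambda = baseline_error_m / wavelength_m
phase_error_deg = math.degrees(2.0 * math.pi * delta_d_lambda * separation_rad)

# Equivalent position offset, (dD/D)*theta_sep, in microarcseconds.
position_error_uas = 1000.0 * RAD_TO_MAS * (baseline_error_m / baseline_m) * separation_rad

print(f"D_lambda       = {d_lambda:.1e}")
print(f"resolution     = {resolution_mas:.2f} mas")
print(f"phase error    = {phase_error_deg:.2f} deg")
print(f"position error = {position_error_uas:.1f} microarcsec")
```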

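Similarly, if there is an error in the calibrator position of $\Delta\theta_c$, then, from
Eq. (12.34), there will be an error in the phase of

$$
\Delta^2\phi \simeq 2\pi D_\lambda \cos\theta_t\,\Delta\theta_c\,\theta_{\rm sep} ,
\tag{12.38}
$$

where we assume $\sin\theta_t \simeq \sin\theta_c$. Again, ignoring the trigonometric factor, we can
write

$$
\Delta^2\phi \simeq 2\pi D_\lambda\,\Delta\theta_c\,\theta_{\rm sep} \simeq 2\pi\,\frac{\Delta\theta_c\,\theta_{\rm sep}}{\theta_{\rm res}} .
\tag{12.39}
$$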


If astrometry is done in the image plane with an array, then the phase errors on the
various array baselines will have magnitudes given by Eq. (12.38) but will differ by the
trigonometric factors; together they can be thought of as an rms phase noise.
Then the image will be substantially degraded when $\Delta^2\phi \sim 1$. To meet this criterion
with $D = 8000$ km, $\lambda = 4$ cm, and $\theta_{\rm sep} = 1$ degree, the calibrator position must
be known to about 10 mas. With an rms phase error of 1 radian, the visibility of a
point source is reduced by a factor of $\exp(-\sigma_\phi^2/2) \simeq 0.6$. For 2 radians of phase
error, the image would be destroyed. The equivalent angular offset for the phase
shift given in Eq. (12.39) is

$$
\Delta\theta \simeq \Delta\theta_c\,\theta_{\rm sep} .
\tag{12.40}
$$

Note that $\Delta\theta_c$ plays the same role as $\Delta D/D$ in Eq. (12.37). Hence, an astrometric
accuracy of 10 μas requires position error in the calibrator of 150 μas or less. An
analysis in the $(u, v)$ plane is given in Appendix 12.2.
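The calibrator-position tolerance and the loss of visibility quoted above can be reproduced as follows. This is a sketch under the same simplifications as the text: the trigonometric factor in Eq. (12.38) is set to unity, the image is taken to be substantially degraded when the second-order phase error reaches 1 radian, and the phase errors are treated as Gaussian with rms $\sigma_\phi$. The function names are hypothetical.

```python
import math

RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0   # milliarcseconds per radian

def calibrator_tolerance_mas(d_lambda, separation_rad, phase_limit_rad=1.0):
    """Calibrator offset at which 2*pi*D_lambda*d_theta_c*theta_sep equals phase_limit_rad."""
    return RAD_TO_MAS * phase_limit_rad / (2.0 * math.pi * d_lambda * separation_rad)

def visibility_reduction(sigma_phi_rad):
    """Amplitude reduction exp(-sigma^2/2) for Gaussian phase errors of rms sigma."""
    return math.exp(-(sigma_phi_rad ** 2) / 2.0)

d_lambda = 8.0e6 / 0.04                 # 8000-km baseline at 4-cm wavelength
separation_rad = math.radians(1.0)      # 1-degree switching angle

tolerance_mas = calibrator_tolerance_mas(d_lambda, separation_rad)
reduction_1_rad = visibility_reduction(1.0)
reduction_2_rad = visibility_reduction(2.0)

print(f"calibrator tolerance  = {tolerance_mas:.1f} mas")
print(f"reduction at 1 radian = {reduction_1_rad:.2f}")
print(f"reduction at 2 radian = {reduction_2_rad:.2f}")
```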


12.2.4 Phase Referencing (Frequency) 


At millimeter wavelengths, position phase referencing becomes more difficult 
because calibration sources are generally weaker and more sparsely distributed than 
at longer wavelengths. Coherence times are also shorter, e.g., a few tens of seconds 
above 100 GHz, requiring rapid antenna pointing changes for calibration by position 
switching. In this case, frequency switching on the target source itself is a valuable 
calibration technique. The goal is to remove effects in the fringe phase that scale as 
frequency, i.e., that can be characterized by a nondispersive or constant delay. The
phase at the lower frequency $\nu_\ell$, denoted $\phi_\ell$, is used to calibrate the phase $\phi_h$ at the
higher frequency $\nu_h$ by forming the quantity

$$
\phi = \phi_h - R\,\phi_\ell ,
\tag{12.41}
$$

where $R$ is the ratio of frequencies $\nu_h/\nu_\ell$. This procedure will remove the effects
of the atmosphere and the frequency standards but not the ionosphere or other
dispersive processes. Note that at low frequencies, the goal of dual frequency
calibration is to remove the ionospheric delay, and $R$ in Eq. (12.41) is replaced by
$1/R$ (e.g., see Sects. 12.6 and 14.1.3). It is usually convenient to choose $R$ to be an
exact integer to avoid the need to deal with phase wrap issues. To see this, focus
on the term that describes the tropospheric excess path length $L$. For this exercise,
$\phi_\ell = 2\pi\nu_\ell L/c + 2\pi n_\ell$ and $\phi_h = 2\pi\nu_h L/c + 2\pi n_h$, where $n_\ell$ and $n_h$ are the integers
that characterize the phase wraps. The calibrated phase is thus

$$
\phi = 2\pi\left(n_h - R\,n_\ell\right) ,
\tag{12.42}
$$

which will be an integral multiple of $2\pi$ if $R$ is an integer. An early demonstration
of this technique was carried out by Middelberg et al. (2005), who used phases 
at 14.375 GHz to calibrate those at 86.25 GHz ($R = 6$). The residual phase errors
caused by the ionosphere and electronic drifts in local oscillator chains have a much
longer timescale than tropospheric variations. These can be removed by adding a
slower position switching cycle. The efficacy of this double switching technique has
been demonstrated by Rioja and Dodson (2011), Rioja et al. (2014, 2015), and Jung
et al. (2011). If the source structure consists of a compact core at both frequencies,
as in many AGN, the shift in position with frequency caused by opacity effects in
the source can be an important physical diagnostic. This shift can be accurately 
measured by application of frequency/position switching calibration. 
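The role of an integer frequency ratio can be illustrated with a small numerical experiment. The sketch below considers only a nondispersive excess path $L$, wraps the phase at each frequency as a measurement would, and forms $\phi_h - R\phi_\ell$. The lower frequency of 14.375 GHz and $R = 6$ are taken from the demonstration cited above; the path lengths, the non-integer ratio of 2.5 used for contrast, and the function names are hypothetical.

```python
import math

C_M_PER_S = 299792458.0

def wrap(phase):
    """Wrap a phase in radians into the interval [-pi, pi)."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi

def frequency_referenced_phase(path_m, nu_low_hz, ratio):
    """Wrapped value of phi_h - R*phi_l for a nondispersive excess path."""
    phi_low = wrap(2.0 * math.pi * nu_low_hz * path_m / C_M_PER_S)
    phi_high = wrap(2.0 * math.pi * ratio * nu_low_hz * path_m / C_M_PER_S)
    return wrap(phi_high - ratio * phi_low)

NU_LOW_HZ = 14.375e9
paths_m = [0.01, 0.03, 0.05, 0.12]   # hypothetical tropospheric excess paths

# Integer ratio: the residual is a multiple of 2*pi, so it wraps to zero.
integer_ratio = [frequency_referenced_phase(p, NU_LOW_HZ, 6) for p in paths_m]

# Non-integer ratio: the unknown wrap count n_l leaks into the result.
fractional_ratio = [frequency_referenced_phase(p, NU_LOW_HZ, 2.5) for p in paths_m]

print("R = 6:  ", [round(v, 6) for v in integer_ratio])
print("R = 2.5:", [round(v, 6) for v in fractional_ratio])
```

With a non-integer ratio the calibrated phase depends on the number of turns lost in the low-frequency phase, so the lower-frequency phase would have to be unwrapped before scaling.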


12.3 Time and Motion of the Earth 


We now consider the effect of changes in the magnitude and direction of the Earth’s 
rotation vector on interferometric measurements. These changes cause variations in 
the apparent celestial coordinates of sources, the baseline vectors of the antennas, 
and universal time. The variations of the Earth’s rotation can be divided into three 
categories: 


1. There are variations in the direction of the rotation axis, resulting mainly from 
precession and nutation of the spinning body. Since the direction of the axis 
defines the location of the pole of the celestial coordinate system, the result is a 
variation in the right ascension and declination of celestial objects. 

2. The axis of rotation varies slightly with respect to the Earth’s surface; that is, 
the positions on the Earth at which this axis intersects the Earth’s surface vary. 
This effect is known as polar motion. Since the (X, Y, Z) coordinate system of 
baseline specification introduced in Sect. 4.1 takes the direction of the Earth’s 
axis as the Z axis, polar motion results in a variation of the measured baseline 
vectors (but not of the baseline length). It also results in a variation in universal 
time. 

3. The rate of rotation varies as a result of atmospheric and crustal effects, and this 
again results in variation in universal time. 


We briefly discuss these effects. Detailed discussions from a geophysical
viewpoint can be found in Lambeck (1980).


12.3.1 Precession and Nutation 


The gravitational effects of the Sun, Moon, and planets on the nonspherical Earth 
produce a variety of perturbations in its orbital and rotational motions. To take 
account of these effects, it is necessary to know the resulting variation of the 
ecliptic, which is defined by the plane of the Earth’s orbit, as well as the variation 
of the celestial equator, which is defined by the rotational motion of the Earth. 
The gravitational effects of the Sun and Moon on the equatorial bulge (quadrupole 
moment) of the Earth result in a precessional motion of the Earth’s axis around the 
pole of the ecliptic. 

The Earth’s rotation vector is inclined at about 23.5° to the pole of its orbital 
plane, the ecliptic. The period of the resulting precession is approximately 
26,000 years, corresponding to a motion of the rotation vector of 20” per year
[2π sin(23.5°)/26,000 radians per year]. The 23.5° obliquity is not constant but is
currently decreasing at a rate of 47” per century, due to the effect of the planets, 
which also cause a further component of precession. The lunisolar and planetary 
precessional effects, together with a smaller relativistic precession, are known as 
the general precession. Precession results in the motion of the line of intersection 
of the ecliptic and celestial equator. This line, called the line of nodes, defines 
the equinoxes and the zero of right ascension, which precess at a rate of 50” per 
year. In addition, the time-varying lunisolar gravitational effects cause nutation of
the Earth’s axis with periods of up to 18.6 years and a total amplitude of about 
9”. The principal variations of the ecliptic and equator are those just described, 
but other smaller effects also occur. The general accuracy within which positional 
variations can be calculated is better than 1 mas (Herring et al. 1985). Expressions 
for precession can be found in Lieske et al. (1977) and for nutation in Wahr (1981). 
The required procedures are discussed in texts on spherical astronomy, such as 
Woolard and Clemence (1966), Taff (1981), and Seidelmann (1992). 
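
As a quick check of the precession rates quoted above, the following Python sketch evaluates 2π sin(23.5°)/26,000 radians per year and the corresponding motion of the equinox along the ecliptic. It assumes a uniform 26,000-year period and a fixed obliquity; the function name is illustrative only.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # about 206265


def precession_rates(period_yr=26_000.0, obliquity_deg=23.5):
    """Return (axis_motion, equinox_motion) in arcseconds per year."""
    cone_rate = 2.0 * math.pi / period_yr  # rad/yr around the ecliptic pole
    axis_motion = cone_rate * math.sin(math.radians(obliquity_deg)) * ARCSEC_PER_RAD
    equinox_motion = cone_rate * ARCSEC_PER_RAD
    return axis_motion, equinox_motion


axis_rate, equinox_rate = precession_rates()
print(f"rotation axis: {axis_rate:.1f} arcsec/yr, equinox: {equinox_rate:.1f} arcsec/yr")
```

The two values come out close to the 20” and 50” per year given in the text.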

Since precession and nutation result in variations in celestial coordinates that can 
be as large as 50” per year for objects at low declinations, these effects must be taken 
into account in almost all observational work, whether astrometric or not. Positions 
of objects in astronomical catalogs are therefore reduced to the coordinates of 
standard epochs, B1900.0, B1950.0, or J2000.0. These dates denote the beginning of 
a Besselian year or Julian year, as indicated by the B or J. The positions correspond 
to the mean equator and equinox for the specified epoch, where “mean” indicates 
the positions of the equator and equinox resulting from the general precession, 
but not including nutation. For further explanation and a discussion of a method 
of conversion between standard epochs, see Seidelmann (1992). Correction is also 
required for aberration, that is, for the apparent shift in position resulting from the 
finite velocity of light and the motion of the observer. Two components are involved: 
annual aberration resulting from the Earth’s orbital motions, which has a maximum 
value of about 20”; and diurnal aberration resulting from the rotational motion, 
which has a maximum value of 0.3”. The retarded baseline concept (Sect. 9.3) used 
in VLBI data reduction accounts for the diurnal aberration. For the nearer stars, 
corrections for proper motion (i.e., actual motion of the star through space) are 
required and in some cases also for the parallax resulting from the changing position 
of the Earth in its orbit (see Sect. 12.5). The impact of radio techniques, particularly 
VLBI, is resulting in refinement of the classical expressions and parameters. Effects 
such as the deflection of electromagnetic waves in the Sun’s gravitational field must 
also be included in positional work of the highest accuracy (see Sect. 12.6). 
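
The aberration amplitudes quoted above follow from the first-order ratio v/c. The Python sketch below evaluates it for the Earth's orbital motion and for the rotation speed at the equator. Both speeds are assumed round values that are not given in the text, and the function name is illustrative.

```python
import math

C_KM_S = 299_792.458                        # speed of light in km/s
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

EARTH_ORBITAL_SPEED_KM_S = 29.78            # assumed mean orbital speed
EQUATORIAL_ROTATION_SPEED_KM_S = 0.465      # assumed rotation speed at the equator


def aberration_arcsec(speed_km_s):
    """Maximum first-order aberration angle v/c, in arcseconds."""
    return speed_km_s / C_KM_S * ARCSEC_PER_RAD


print(f"annual:  {aberration_arcsec(EARTH_ORBITAL_SPEED_KM_S):.2f} arcsec")
print(f"diurnal: {aberration_arcsec(EQUATORIAL_ROTATION_SPEED_KM_S):.2f} arcsec")
```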


12.3.2 Polar Motion 


The term polar motion denotes the variation of the pole of rotation of the Earth 
(the geographic pole) with respect to the Earth’s crust. This results in a component 
of motion of the celestial pole that is distinct from precessional and other motions.
Polar motion is largely, but not totally, of geophysical origin. The motion of the
geographic pole around the pole of the Earth’s figure is irregular, but over the last
century, the distance between these two poles wandered by up to 0.5”, or 15 m on
the Earth’s surface. In a year’s time, the excursion of the figure axis is typically
6 m or less. The motion can be analyzed into several components, some regular and
some highly irregular, and not all are understood. The two major components have 
periods of 12 and 14 months. The 12-month component is a forced motion due to the 
annual redistribution of water and of atmospheric angular momentum and is far from 
any resonance. The 14-month component, known as the Chandler wobble (Chandler 
1891), is the motion at a resonance frequency whose driving force is unknown. For 
a more detailed description, see Wahr (1996). 

The motion of the pole of rotation is measured in angle or distance in the x and 
y directions, as shown in Fig. 12.2. The (x, y) origin is the mean pole of 1900-1905, 
which is referred to as the conventional international origin (CIO), and the x axis is 
in the plane of the Greenwich meridian (Markowitz and Guinot 1968). Since polar 
motion is a small angular effect, it can often be ignored in imaging observations, 
especially if the visibility is measured with respect to a calibrator that is only a few 
degrees from the center of the field being imaged. 


Fig. 12.2 Coordinate system for the measurement of polar motion. The x coordinate is in
the plane of the Greenwich meridian, and the y axis is 90° to the west. CIO is the
conventional international origin.


12.3.3 Universal Time


Like the motion of the Earth, the system of timekeeping based on Earth rotation is a 
complicated subject, and for a detailed discussion, one can refer to Smith (1972) or 
to the texts mentioned in the discussion of precession and nutation above. We shall 
briefly review some essentials. Solar time is defined in terms of the rotation of the 
Earth with respect to the Sun. In practice, the stars present more convenient objects 
for measurement, so solar time is derived from measurement of the sidereal rotation. 
The positions of stars or radio sources used for such measurements are adjusted for 
precession, nutation, and so on, and the resulting time measurements thus depend 
only on the angular velocity of the Earth and on polar motion. When converted to the 
solar timescale, these measurements provide a form of universal time (UT) known as 
UT0; this is not truly “universal” since the effects of polar motion, which can amount
to about 35 ms, depend on the location of the observatory. When UT0 is corrected
for polar motion, the result is known as UT1. Since it is a measure of the rotation 
of the Earth relative to fixed celestial objects, UT1 is the form of time required 
in astronomical observing, including the analysis of interferometric observations, 
navigation, and surveying. However, UT1 contains the effects of small variations 
in the Earth’s rotation rate, attributable largely to geophysical effects such as the 
seasonal variations in the distribution of water between the surface and atmosphere. 
Fluctuations in the length of day over the period of a year are typically about 1 ms. 
To provide a more uniform measure of time, UT2 is derived from UT1 by attempting
to remove seasonal variations. UT2 is rarely used. UT1 and UT2 include the effect 
of the gradual decrease of the rotation rate of the Earth. This causes the length of 
the UT1/UT2 day to increase slightly when compared with International Atomic 
Time (IAT), which is based on the frequency of the cesium line (see Sect. 9.5.4). 
The IAT second is the basis for another form of UT, Coordinated Universal Time 
(UTC), which is offset from IAT so that |UT1 − UTC| < 1 s. This relationship
is maintained by inserting one-second discontinuities (leap seconds) in UTC when 
required on specified days of the year. 

The practice at many observatories is to maintain UTC or IAT using an atomic 
standard and then obtain UT1 from the published values of ΔUT1 = UT1 − UTC.
Since ΔUT1 is measured rather than computed, in principle it can be determined
only after the fact. However, it is possible to predict it by extrapolation with
satisfactory accuracy for periods of one or two weeks and thus to implement UT1 in
real time. Values of ΔUT1 are available from the Bureau International de l’Heure
(BIH), which was established in 1912 at the Paris Observatory to coordinate
international timekeeping, and from the U.S. Naval Observatory. Rapid service data
are available from these institutions with a timeliness suitable for extrapolation.


12.3.4 Measurement of Polar Motion and UT1


The classical optical method of measuring polar motion and UT1 is to time the
meridian transits of stars of known positions. Observations at different longitudes,
using stars at more than one declination, are required to determine all three
parameters (x, y, ΔUT1). During the 1970s, it became evident that such astrometric
tasks can also be performed by radio interferometry (McCarthy and Pilkington 
1979). 

To specify the baseline components of an interferometer for such measurements, 
we use the (X,Y,Z) system of Sect. 4.1, rotated so that the X axis lies in the 
Greenwich meridian instead of the local meridian. Let $\Delta X$, $\Delta Y$, and $\Delta Z$ be the
changes in the baseline components resulting from polar motion $(x, y)$ (in radians)
and a time variation (UT1 − UTC) corresponding to $\Theta$ radians. Then we may write


$$
\begin{bmatrix} \Delta X \\ \Delta Y \\ \Delta Z \end{bmatrix} =
\begin{bmatrix} 0 & -\Theta & x \\ \Theta & 0 & -y \\ -x & y & 0 \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \tag{12.43}
$$


where the square matrix is a three-dimensional rotation matrix valid for small
angles of rotation. $\Theta$, $x$, and $y$ are the rotation angles about the Z, Y, and X axes,
respectively. From Eq. (12.43), we obtain


$$
\begin{aligned}
\Delta X &= -\Theta Y + xZ, \\
\Delta Y &= \Theta X - yZ, \\
\Delta Z &= -xX + yY.
\end{aligned} \tag{12.44}
$$


Thus, if one observes a series of sources at periodic intervals and determines the
variation in baseline parameters, Eqs. (12.44) can be used to determine UT1 and
polar motion. For an interferometer with an east-west baseline ($Z = 0$), one can
determine $\Theta$ but cannot separate the effects of $x$ and $y$. An east-west interferometer
located on the Greenwich meridian ($X = Z = 0$) would yield measures of $\Theta$ and $y$
but not of $x$. If it had a north-south component of baseline ($Z \neq 0$), one could still
measure $y$ but would not be able to separate the effects of $x$ and $\Theta$. In general, one
cannot measure all three quantities with a single baseline, since a single direction 
is specified by two parameters only. Systems suitable for a complete solution might 
be, for example, two east-west interferometers separated by about 90° in longitude 
or a three-element noncollinear interferometer. An example of VLBI measurements 
of the pole position is shown in Fig. 12.3. The Global Positioning System provides 
a method of making pole-position measurements [see, e.g., Herring (1999)]. 
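
The sensitivity of different baseline orientations can be explored with a direct evaluation of Eqs. (12.44). The Python sketch below is a minimal illustration: the function name, the 1000-km baseline, and the offset values are hypothetical. It shows that an east-west baseline on the Greenwich meridian (X = Z = 0) responds to Θ and y but not to x.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def baseline_change(X, Y, Z, x, y, theta):
    """Eqs. (12.44): small-angle change in baseline components.

    X, Y, Z are baseline components (any length unit); x, y are the
    polar-motion angles and theta is the UT1 - UTC rotation angle (radians).
    """
    dX = -theta * Y + x * Z
    dY = theta * X - y * Z
    dZ = -x * X + y * Y
    return dX, dY, dZ


# Hypothetical 1000-km east-west baseline on the Greenwich meridian.
# 1 ms of time corresponds to 0.015 arcsec of rotation.
offsets_a = baseline_change(0.0, 1.0e6, 0.0, 0.1 * ARCSEC, 0.2 * ARCSEC, 0.015 * ARCSEC)
offsets_b = baseline_change(0.0, 1.0e6, 0.0, 0.3 * ARCSEC, 0.2 * ARCSEC, 0.015 * ARCSEC)
print(offsets_a)  # (dX, dY, dZ) in meters
print(offsets_a == offsets_b)  # changing x alone makes no difference here
```

Because the matrix in Eq. (12.43) is antisymmetric, the computed change is perpendicular to the baseline, so the baseline length is unchanged to first order, as noted in Sect. 12.3.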

The methods just described are applicable to observations using connected-element
interferometers in which the phase can be calibrated, and also to VLBI
observations in which the bandwidth is sufficient to obtain accurate group delay
measurements. An example of VLBI determination of the length of day is shown
in Fig. 12.4. The data show an annual variation of about 2 ms, which is caused by
the angular momentum exchange between the Earth and the atmosphere due to the
difference in land mass in the Northern and Southern Hemispheres [see, e.g., Paek
and Huang (2012)]. The trend in the long-term variation is thought to be due to an
exchange of angular momentum between the Earth’s core and mantle. The effects
of El Niño events can be seen in these data (Gipson and Ma 1999). A comparison of
determinations of UT1 and polar motion by VLBI, satellite laser ranging, and BIH
analyses of standard astrometric data is given in Robertson et al. (1983) and Carter
et al. (1984).

Fig. 12.3 VLBI determination of the position of the pole. The diamonds indicate measurements in
which three or more stations participated in the observations so that both pole coordinates could be
determined from the observations. The crosses mark observations that employed only one baseline,
for which only the x component of the pole was determined. The corresponding y components were
obtained from the BIH. Note that 100 mas corresponds to 3.2 m. From Carter et al. (1985), © John
Wiley & Sons.

VLBI is a unique tool for the study of many phenomena related to Earth 
dynamics. For example, the period and amplitude of the free-core nutation have been
estimated (Krásná et al. 2013).

Fig. 12.4 The length of day (LOD) with respect to the standard day of 86,400 s, as measured
by VLBI, 1980–2015. Lunar tidal effects have been removed. The measurement accuracy is
typically 20 μs, which corresponds to a distance along the equator of 2 mm. A two-year (triangular
weighting) running mean of the LOD data is shown by the heavy line. Data provided by John
Gipson, NASA/GSFC.


12.4 Geodetic Measurements 


Certain geophysical phenomena, for example, Earth tides (Melchior 1978) and 
movements of tectonic plates, can result in variations in the baseline vector of a 
VLBI system. Variations in the length of the baseline are clearly attributable to such 
phenomena, whereas variations in the direction can also result from polar motion 
and rotational variations. Magnitudes of the effects are of order 1–10 cm per year
for plate motions and 30 cm (diurnal) for Earth tides. They are thus measurable
using the techniques of VLBI. Solid-Earth tides were first detected by Shapiro et al.
(1974), and refined measurements were reported by Herring et al. (1983). In addition
to solid-Earth tides, displacement of land masses resulting from tidal shifts of water
masses, called ocean loading, is measurable. The earliest evidence of contemporary
motion of tectonic plates was found by Herring et al. (1986), who reported that
the increase in the baseline between Westford, MA, and Onsala, Sweden, based on
data from 1980 to 1984, was 17 ± 2 mm/yr. A plot of the extensive measurements
of the Westford–Onsala baseline is shown in Fig. 12.5. For reviews of geodetic
applications of VLBI, see Shapiro (1976), Counselman (1976), Clark et al. (1985), 
Carter and Robertson (1993), and Sovers et al. (1998).

Fig. 12.5 The baseline length between Westford, MA, and Onsala, Sweden, determined by 499
VLBI observations from 1981 to 2015. The formal error in recent measurements is typically less
than 1 mm. The straight line fit to the data has a slope of 16.34 ± 0.04 mm/yr. For an analysis of
short-term systematic trends in baseline-length data, see Titov (2002). These data were taken from
the website of the International VLBI Service for Geodesy and Astrometry
(http://ivscc.gsfc.nasa.gov/products-data).


12.5 Proper Motion and Parallax Measurements 


The position of a relatively nearby star or radio source changes with respect to the 
distant background due to the annual motion of the Earth around the Sun. This 
effect is called annual parallax. It can be used to measure distances by the classical 
technique of trigonometric triangulation, first demonstrated by Bessel (1838) from 
optical observations of the star 61 Cygni. The parallax angle, $\Pi$, is defined as
one-half of the total excursion in apparent position over a year. The distance to the object,
by simple trigonometry for small angles, is


$$D = \frac{1}{\Pi}. \tag{12.45}$$


By definition, an object with a parallactic angle of 1” has a distance of 1 parsec.
Hence, a parsec is 206265 (the number of arcseconds in a radian) times the Sun–Earth
distance [called the astronomical unit (AU)], or $3.1 \times 10^{18}$ cm. The AU, which
is determined by ranging measurements of the planets and spacecraft, is called the
first rung on the cosmic distance ladder. Its value is $1.4959787070000 \times 10^{13}$ cm, to
an accuracy of about 1 part in 50 billion (Pitjeva and Standish 2009). The intrinsic
motion of nearby objects can also be measured. This is called proper motion. The
precision of VLBI astrometric measurements has greatly extended the distances
over which proper motions and parallaxes can be measured. If the parallax accuracy
is $\sigma_\Pi$, the uncertainty in the distance can be determined from the differential of
Eq. (12.45), i.e., $\Delta D = -\Pi^{-2}\,\Delta\Pi$, to be $\sigma_D = D^2 \sigma_\Pi$. Hence, the fractional distance
accuracy is


$$\frac{\sigma_D}{D} = D\,\sigma_\Pi. \tag{12.46}$$


The important result here is that the fractional distance uncertainty grows with distance
for a fixed positional accuracy. Hence, a distance accurate to 10% can be
measured for objects out to 10 pc with $\sigma_\Pi = 0.01$” (ground-based optical),
to 100 pc with $\sigma_\Pi = 1$ mas (Hipparcos satellite), and to $10^4$ pc with $\sigma_\Pi = 10$ μas
(VLBI).
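
The distance limits quoted above follow directly from Eqs. (12.45) and (12.46) when parallaxes are expressed in arcseconds and distances in parsecs. The Python sketch below reproduces them and the length of the parsec. The function names are illustrative and not taken from any library.

```python
import math

AU_CM = 1.4959787070000e13                  # astronomical unit, from the text
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi   # about 206265


def parsec_cm():
    """Length of one parsec: the distance at which 1 AU subtends 1 arcsec."""
    return ARCSEC_PER_RAD * AU_CM


def fractional_distance_error(distance_pc, sigma_parallax_arcsec):
    """Eq. (12.46): sigma_D / D = D * sigma_Pi."""
    return distance_pc * sigma_parallax_arcsec


def max_distance_pc(sigma_parallax_arcsec, fractional_error=0.1):
    """Largest distance at which the requested fractional accuracy is reached."""
    return fractional_error / sigma_parallax_arcsec


print(f"1 pc = {parsec_cm():.4e} cm")
for label, sigma in [("optical", 0.01), ("Hipparcos", 1e-3), ("VLBI", 10e-6)]:
    print(f"{label}: 10% accuracy out to {max_distance_pc(sigma):.0f} pc")
```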


For measurements of parallax with better than 10% accuracy, $\sigma_\Pi/\Pi < 0.1$, the
distance estimate of $D = 1/\Pi \pm \sigma_\Pi/\Pi^2$ is essentially unbiased. The situation is more
complex when the accuracy is lower. If the probability distribution for a parallax
measurement is


$$p(\Pi) = \frac{1}{\sqrt{2\pi}\,\sigma_\Pi} \exp\left[-\frac{(\Pi - \Pi_0)^2}{2\sigma_\Pi^2}\right], \tag{12.47}$$


where $\Pi_0$ is the true, but unknown, parallax, then the probability distribution
function of $D$, $p(D) = p(\Pi)\,|d\Pi/dD|$, is


$$p(D) = \frac{1}{\sqrt{2\pi}\,\sigma_\Pi D^2} \exp\left[-\frac{(1/D - \Pi_0)^2}{2\sigma_\Pi^2}\right]. \tag{12.48}$$


$p(D)$ becomes increasingly asymmetric with increasing $\sigma_\Pi/\Pi$ and develops a long
tail at large values of $D$. The expectation of $D$, i.e., $\langle D \rangle$, can be calculated from
Eq. (12.47) by Taylor expansion of $D = 1/\Pi$, which gives the result


$$\langle D \rangle \simeq \frac{1}{\Pi_0}\left[1 + \frac{\sigma_\Pi^2}{\Pi_0^2}\right]. \tag{12.49}$$
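
The bias described by Eq. (12.49), taken here as ⟨D⟩ ≈ (1/Π₀)[1 + σ_Π²/Π₀²], can be checked with a simple Monte Carlo experiment. The Python sketch below is an illustration under stated assumptions: the function names, sample size, and seed are arbitrary choices. Strictly, the mean of 1/Π for a Gaussian is not finite because of samples near Π = 0, so the comparison is meaningful only when σ_Π/Π₀ is small enough that such samples never occur in practice.

```python
import random


def mean_inverse_parallax(pi0, sigma, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <1/Pi> for Gaussian parallax errors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += 1.0 / rng.gauss(pi0, sigma)
    return total / n_samples


def expected_distance_taylor(pi0, sigma):
    """Second-order Taylor estimate of <D>, as in Eq. (12.49)."""
    return (1.0 / pi0) * (1.0 + (sigma / pi0) ** 2)


# Hypothetical case: true parallax of 1 mas (D = 1 kpc) measured to 10%.
print(mean_inverse_parallax(1.0, 0.1))      # distance in kpc
print(expected_distance_taylor(1.0, 0.1))   # distance in kpc
```

Both numbers exceed the true distance of 1 kpc by about 1%, which shows the sense of the bias: noisy parallaxes overestimate the distance on average.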


For the case of a single source, an accepted strategy is to perform a Markov chain 
Monte Carlo (MCMC) analysis of the position-vs.-time data with D as a parameter 
and apply an appropriate prior distribution of D to estimate the final distribution, 
p (D). The difficulties of parallax analysis at low signal-to-noise ratio, including the 
Lutz–Kelker effect (Lutz and Kelker 1973), are discussed by Bailer-Jones (2015)
and Verbiest and Lorimer (2014). 

Parallaxes have been measured to many pulsars (Verbiest et al. 2010, 2012). 
These may be compared with indirect estimates based on dispersion measures and 
galactic models of electron density. Precision parallax distances may prove to be 
important in the use of pulsar timing measurements to detect gravitational radiation
[see Madison et al. (2013)]. The distance to the pulsar PSR J2222-0137 has been
determined to be 267.3 pc, with an accuracy of 0.4% (Deller et al. 2013).

The star IM Peg, which has detectable radio emission, provides an example of 
a precise measurement of parallax with VLBI (Bartel et al. 2015). Its position was 
precisely determined over 39 epochs spanning six years so that it could be used as 
a guide star for the physics experiment Gravity Probe B (Everitt et al. 2011). The 
position of the radio star is shown in Fig. 12.6. The position shift is dominated by 
the proper motion. The annual parallax can be readily seen when this proper motion, 
modeled as a constant velocity vector, is removed, as shown in Fig. 12.7. 

An excellent example of the steady improvement in VLBI parallax measurements 
can be found in the work on the Orion Nebula, a galactic object of singular 
importance in astronomy. The results, shown in Table 12.1, made with a variety 
of continuum and spectral line sources over a considerable frequency range, have 
yielded a distance accurate to 1.5%. This corresponds to a parallactic accuracy 
[Eq. (12.46)] of ~30 μas.

Fig. 12.7 (left) Motion of IM Peg with proper motion and orbital motion removed. The parallactic
angle $\Pi$ is the semimajor axis of the ellipse. (right) Relative position in Dec. (top) and RA
(bottom) vs. solar longitude. Plots derived from the data in Table 1 of Ratner et al. (2012).


There are other notable examples of parallax measurements. The distance to the
Pleiades Cluster was determined by VLBI to be 136.2 ± 1.2 pc by Melis et al.
(2014), resolving a long-standing discrepancy in its distance estimates. VLBI has
also been used to detect the apparent motion of Sgr A*, the radio source at the
center of the Galaxy, against the extragalactic background caused by the rotation of
the Galaxy. These results are shown in Fig. 12.8. A combination of these data and
parallax measurements with the VLBA, VERA, and EVN of more than 100 masers
gives Galactic structure parameters of $R_0 = 8.34 \pm 0.16$ kpc and $\Theta_0 = 240 \pm 8$ km s$^{-1}$
(Reid et al. 2014).


Table 12.1 VLBI parallax distance measurements to the Orion Nebula^a

Parallax                Source   No. of   Freq.   Distance
method^b    Array       type^c   epochs   (GHz)   (pc)        Reference

Expansion   VLBI^d      H2O      5        22      480 ± 80    Genzel et al. (1981)
Annual      VLBA        YSO      5        15      389 ± 24    Sandstrom et al. (2007)
Annual      VERA        H2O      16       22      437 ± 19    Hirota et al. (2007)
Annual      VLBA        YSO      4        8       414 ± 7     Menten et al. (2007)
Annual      VERA        SiO      7        43      418 ± 6     Kim et al. (2008)
Annual      VLBA        YSO      5        5       383 ± 5     Kounkel et al. (2016)

^a All of the sources in these studies were colocated within the Orion Nebula Cluster (ONC) to
an angular distance of about ±10′ or a projected distance of ~2 pc. Distance measurements
before 1980 were in the range 300–540 pc; the distance measured by Hipparcos to one star was
361 pc (Bertout et al. 1999).
^b Expansion parallax used a model of internal symmetrical expansion of 21 maser components.
^c H2O = water vapor masers; YSO = young stellar objects with nonthermal emission from GMR
catalog (Garay et al. 1987); SiO = silicon monoxide masers.
^d Ad hoc four-station VLBI array.

Fig. 12.8 Apparent positions of Sgr A*, the radio source in the Galactic center, relative to the
extragalactic calibrator J1745-283, measured over an eight-year period at 43 GHz with the VLBA.
The ellipses around the measurement points indicate the scatter-broadened size of Sgr A* (see
Fig. 14.10). The one-sigma error bars in the measurements are also shown. The broken line is the
variance-weighted least-mean-squares fit to the data, and the solid line indicates the orientation
of the Galactic plane. The motion is almost entirely in galactic longitude, attributable to the solar
motion around the center of the Galaxy of 241 ± 15 km s$^{-1}$, for a distance between the Sun and
Galactic center of 8 ± 0.5 kpc. The limit on the residual motion of Sgr A* is nearly two orders of
magnitude less than that of the motions of stars lying within a projected distance of about 0.02 pc
of Sgr A*. These stellar motions suggest that about $4.1 \times 10^6\,M_\odot$ of matter are contained within
0.02 pc of Sgr A*, and the lack of detected motion of Sgr A* itself suggests that a mass of at
least $10^5\,M_\odot$ must be associated with the radio source Sgr A*. For a comparison of measurements
made with the VLA, see Backer and Sramek (1999). From Reid and Brunthaler (2004). © AAS.
Reproduced with permission.


12.6 Solar Gravitational Deflection 


The bending of electromagnetic radiation passing a massive body is described in the
parametrized post-Newtonian formalism of general relativity (GR) by the parameter
$\gamma$ and is normally written as [see, e.g., Misner et al. (1973), or Will (1993)]


$$\Delta\epsilon = (1 + \gamma)\,\frac{GM}{p\,c^2}\,(1 + \cos\epsilon). \tag{12.50}$$


$G$ is the gravitational constant; $M$ is the mass of the perturbing body, which we
take to be the Sun in this discussion; $p$ is the impact parameter (closest approach of
the unperturbed ray to the Sun); and $\epsilon$ is the elongation angle (the angle between
the direction to the source and the direction to the Sun as seen by the observer).
Equation (12.50) holds for sources at infinite distance. This parametrization reflects
the fact that the bending predicted by Newtonian physics is exactly half the value
predicted by GR, i.e., $\gamma = 1$ for GR and 0 for Newtonian physics. $GM/c^2$ is known
as the gravitational radius, which is 1.48 km for the Sun. For a ray path passing close
to the Sun where $\epsilon \ll 1$, Eq. (12.50) can be approximated as


$$\Delta\epsilon = (1 + \gamma)\,\frac{2GM}{p\,c^2}. \tag{12.51}$$


For a ray grazing the surface of the Sun, where $p = r_0$ (corresponding to $\epsilon = 0.267^\circ$)
and $r_0$ is the solar radius, the deflection angle for $\gamma = 1$ is 1.75”.

Equation (12.50) can be rewritten so as to eliminate $p$, since $p = R_0 \sin\epsilon$, where
$R_0$ is the distance from the Sun to the Earth. After some trigonometric manipulation,
the deflection angle can be expressed as


$$\Delta\epsilon = (1 + \gamma)\,\frac{GM}{c^2 R_0}\sqrt{\frac{1 + \cos\epsilon}{1 - \cos\epsilon}}. \tag{12.52}$$


$\Delta\epsilon$ declines monotonically with $\epsilon$, as shown in Fig. 12.9, and for $\gamma = 1$ has a value
of 4.07 mas at $\epsilon = 90^\circ$, 1 mas at $\epsilon = 150^\circ$, and 0 at $\epsilon = 180^\circ$. Furthermore, two
sources separated by 1° near $\epsilon = 90^\circ$ will suffer a 70-μas shift in their relative
positions.
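
The numerical values quoted for the solar deflection can be reproduced from Eq. (12.51), in the form (1 + γ) 2GM/(p c²), and Eq. (12.52), in the form (1 + γ) [GM/(c² R₀)] √[(1 + cos ε)/(1 − cos ε)]. The Python sketch below does this. The constants are assumed nominal values (the text rounds the gravitational radius to 1.48 km), and the function names are illustrative.

```python
import math

GRAV_RADIUS_SUN_KM = 1.4766      # GM/c^2 for the Sun (assumed; 1.48 km in the text)
AU_KM = 1.495978707e8            # Sun-Earth distance R_0
SOLAR_RADIUS_KM = 6.957e5        # assumed nominal solar radius
MAS_PER_RAD = 180.0 * 3600.0 * 1000.0 / math.pi


def deflection_mas(elongation_deg, gamma=1.0):
    """Eq. (12.52): deflection in milliarcseconds (not valid at 0 degrees)."""
    eps = math.radians(elongation_deg)
    geometry = math.sqrt((1.0 + math.cos(eps)) / (1.0 - math.cos(eps)))
    return (1.0 + gamma) * GRAV_RADIUS_SUN_KM / AU_KM * geometry * MAS_PER_RAD


def limb_deflection_arcsec(gamma=1.0):
    """Eq. (12.51) with the impact parameter equal to the solar radius."""
    rad = (1.0 + gamma) * 2.0 * GRAV_RADIUS_SUN_KM / SOLAR_RADIUS_KM
    return rad * MAS_PER_RAD / 1000.0


print(deflection_mas(90.0), deflection_mas(150.0), deflection_mas(180.0))
print(limb_deflection_arcsec())
# Differential deflection for two sources 1 degree apart, centered on 90 degrees,
# in microarcseconds:
print(1000.0 * (deflection_mas(89.5) - deflection_mas(90.5)))
```

Setting `gamma=0.0` halves every value, which is the Newtonian prediction described above.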

Shapiro (1967) first suggested that GR could be tested by observing the deflection
of radio waves passing in the vicinity of the Sun. This is just the radio version of
the famous optical experiment first performed in 1919 by the Eddington expedition
(Dyson et al. 1920). For a long time, the radio astronomical experiments were based
on the two sources 3C279 and 3C273, which are separated by about 10 degrees and
pass fortuitously close in angle to the Sun each October. In fact, 3C279 is occulted
by the Sun on October 8. The measurement of the change in relative position of
these sources can be used to estimate $\gamma$. The challenge of such measurements is to
overcome the effects of the ionized plasmas surrounding the Sun, i.e., the corona
and solar wind, whose effects diminish with distance from the Sun and scale as $\lambda^2$ (see
Sect. 14.3.1). Note that the ray bending caused by the solar plasma has the opposite
sign to that caused by GR, i.e., plasma bending makes sources appear closer to the Sun
in angle and GR makes them appear farther.

Fig. 12.9 The gravitational deflection, $\Delta\epsilon$, of a radio wave passing the Sun at an elongation angle,
$\epsilon$, calculated from Eq. (12.52) when $\gamma = 1$.

The first radio interferometry experiments were undertaken for the 1969 passage, 
one with two antennas forming an interferometer at the Owens Valley Radio 
Observatory at a frequency of 9.1 GHz and a baseline of 1.4 km, and the other with
two antennas forming an ad hoc interferometer at the JPL Goldstone facility at a
frequency of 2.4 GHz and a baseline of 21 km. The solar plasma was modeled by
both groups as two power-law components with amplitude parameters estimated 
from the data (see Sect. 14.3.1). The results from both experiments confirmed GR 
to an accuracy of about 30%, with the JPL instrument’s advantage of longer baseline 
and higher resolution compensating for the OVRO instrument’s advantage of higher 
frequency. The experiment has been repeated many times with ever more sensitive 
equipment and more refined techniques, and the results are listed in Table 12.2. The 
first VLBI experiment was reported by Counselman et al. (1974) for an 845-km 
baseline between Haystack and NRAO. Each site employed two antennas so that 
two coherent interferometers were formed to track both sources simultaneously.
Another major step forward was the use of dual-frequency observations. This
allowed a phase (or delay) observable to be formed from the two phases $\phi_1$ and $\phi_2$
measured at frequencies $\nu_1$ and $\nu_2$,


$$\phi_c = \phi_2 - \left(\frac{\nu_1}{\nu_2}\right)\phi_1, \tag{12.53}$$


from which the dispersive effects of the solar plasma (and ionosphere) are largely
removed (see Sect. 14.1.3). The best result so far for the targeted 3C279/3C273
experiment is $\gamma = 0.9998 \pm 0.0003$ with the VLBA at 15, 23, and 43 GHz (Fomalont
et al. 2009). It should be noted that this result was based largely on the 43-GHz
results alone, where the plasma effects were greatly diminished. Various changes
are expected to be made that will improve this experiment by a factor of four (i.e.,
to a fractional accuracy of better than 1 part in $10^4$).
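
The way the dual-frequency observable of Eq. (12.53) suppresses plasma effects can be seen in a toy model. The Python sketch below takes the combination as φ_c = φ₂ − (ν₁/ν₂) φ₁, which is the form that cancels any phase term proportional to 1/ν, and assumes that the plasma contribution follows that first-order cold-plasma scaling. The phase model, function names, and numerical values are hypothetical.

```python
import math


def model_phase(nu_hz, delay_s, plasma_coeff):
    """Toy phase (rad): nondispersive delay term plus a plasma term scaling as 1/nu."""
    return 2.0 * math.pi * nu_hz * delay_s + plasma_coeff / nu_hz


def plasma_free_phase(phi1, phi2, nu1_hz, nu2_hz):
    """Dual-frequency combination phi_2 - (nu_1 / nu_2) * phi_1."""
    return phi2 - (nu1_hz / nu2_hz) * phi1


nu1, nu2 = 23e9, 43e9      # two VLBA observing frequencies (Hz)
delay = 1e-12              # hypothetical nondispersive delay of 1 ps
for coeff in (0.0, 5e10):  # without and with a plasma term
    phi_c = plasma_free_phase(
        model_phase(nu1, delay, coeff), model_phase(nu2, delay, coeff), nu1, nu2
    )
    print(coeff, phi_c)
```

The combined phase is the same with and without the plasma term and retains a contribution from the nondispersive delay, 2πτ(ν₂ − ν₁²/ν₂), which carries the geometric (including gravitational) information.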

In addition, the huge geodetic VLBI database has been used to estimate $\gamma$, giving
the results listed in Table 12.2. The analysis described by Lambert and
Le Poncin-Lafitte (2011) is based on 5,055 observing sessions (1979–2010) of 3,706 sources
and 7 million delay measurements. The postfit delay residual is 23 picoseconds, and
$\gamma = 0.9992 \pm 0.0001$. The continual accrual of geodetic data will lead to better
results in the future. The best measurement overall of $\gamma$ to date, $\gamma = 1.000021 \pm
0.000023$, was made by analyzing the delay residuals from tracking the Cassini
spacecraft as it passed the Sun in 2002 (Bertotti et al. 2003).


12.7 Imaging Astronomical Masers 


In the envelopes of many newly formed stars, and also those of highly evolved stars 
and the accretion disks of AGN, radio emission from molecules such as H$_2$O and
OH is caused by a maser process. The frequency spectrum of the emission is often
complicated, containing many spectral features or components caused by clouds
of gas moving at different line-of-sight velocities. Maps of strong maser sources
reveal hundreds of compact components with brightness temperatures approaching
$10^{15}$ K, angular sizes as small as $10^{-4}$ arcsec, and flux densities as high as $10^6$
Jy. The components are typically distributed over an area of several arcseconds in
diameter and a Doppler velocity range of 10–3000 km s$^{-1}$ (0.7–200 MHz for the
H$_2$O maser transition at 22 GHz). Individual features have line widths of about
1 km s$^{-1}$ or less (74 kHz at 22 GHz). The physics and phenomenology of masers
are discussed by Reid and Moran (1988), Elitzur (1992), and Gray (2012). The
processing and analysis of maser data require large correlator systems because the
ratio of required bandwidth to spectral resolution is large ($10^2$–$10^4$). They also
require prodigious amounts of image processing because the ratio of the field of
view to the spatial resolution is large ($10^2$–$10^4$). As an extreme example, the H$_2$O
maser in W49 has hundreds of features distributed over 3 arcsec (Gwinn et al. 1992).
The complete mapping of this source at a resolution of $10^{-3}$ arcsec with 3 pixels per
resolution interval would require the production of 600 maps, each with at least $10^8$
pixels. However, most of the map cells would contain no emission. Thus, the usual 
procedure is to measure the positions of the features crudely by fringe-frequency 
analysis and then to map small fields around these locations by Fourier synthesis 
techniques. Examples of maps made by fringe-frequency analysis can be found 
in Walker et al. (1982); by phase analysis in Genzel et al. (1981) and Norris and 
Booth (1981); and by Fourier synthesis in Reid et al. (1980), Norris et al. (1982), 
and Boboltz et al. (1997). We shall briefly discuss some of the techniques used in 
mapping masers and their accuracies. Note that geometric (group) delays cannot be 
measured accurately because of the narrow bandwidths of the maser lines. 
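
The numbers quoted in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a rest frequency of 22.235 GHz for the H2O transition (the text quotes only "22 GHz") and the non-relativistic Doppler relation; the function names are illustrative and not taken from any package.

```python
C_KM_S = 299_792.458      # speed of light, km/s
NU_H2O_HZ = 22.235e9      # assumed H2O rest frequency; the text quotes 22 GHz

def doppler_width_hz(velocity_km_s, nu_hz=NU_H2O_HZ):
    """Frequency interval matching a velocity interval (non-relativistic)."""
    return nu_hz * velocity_km_s / C_KM_S

def pixels_per_map(field_arcsec, resolution_arcsec, pixels_per_beam=3):
    """Number of pixels in a square map covering the field."""
    side = field_arcsec / resolution_arcsec * pixels_per_beam
    return side ** 2

line_width_khz = doppler_width_hz(1.0) / 1e3   # width of a 1 km/s feature, in kHz
w49_pixels = pixels_per_map(3.0, 1e-3)         # W49: 3 arcsec field at 1 mas resolution
```

With these inputs the line width comes out near 74 kHz and each map needs somewhat under $10^{8}$ pixels, which is the scale that motivates mapping only small fields around known features.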

In mapping masers, we must explicitly consider the frequency dependence of 
the fringe visibility. We assume that a maser source consists of a number of point 
sources. Furthermore, we assume that the measurements are made with a VLBI 
system and that the desired RF band is converted to a single baseband channel. 
Adapting Eq. (9.28), we can write the residual fringe phase of one maser component
at frequency $\nu$ as


$$\Delta\phi(\nu) = 2\pi\left[\nu\,\Delta\tau_g(\nu) + (\nu - \nu_{\rm LO})\,\tau_e + \nu\,\tau_{\rm at}\right] + \phi_{\rm in} + 2\pi n , \tag{12.54}$$


where $\tau_e$ is the relative delay error due to clock offsets; $\tau_{\rm at}$ is the differential
atmospheric delay; $\Delta\tau_g(\nu)$ is the difference between the true geometric delay of
the source $\tau_g(\nu)$ and the expected (reference) delay; $\nu_{\rm LO}$ is the local oscillator
frequency; $\phi_{\rm in}$ is the instrumental phase, which includes the local oscillator frequency
difference and can be a rapidly varying function of time; and $2\pi n$ represents the
phase ambiguity. A frequency can usually be found that has only one unresolved
maser component, and this component can then be used as a phase reference. The
use of a phase reference feature is fundamental to all maser analysis procedures,
and it allows maps of the relative positions of maser components to be made with
high accuracy. The difference in residual fringe phase between a maser feature at
frequency $\nu$ and the reference feature at frequency $\nu_R$ is


$$\Delta^2\phi(\nu) = \Delta\phi(\nu) - \Delta\phi(\nu_R) , \tag{12.55}$$


which, with the use of Eq. (12.54), becomes


$$\Delta^2\phi(\nu) = 2\pi\left\{\nu\left[\tau_g(\nu) - \tau_g(\nu_R)\right] + (\nu - \nu_R)\left[\tau_g(\nu_R) - \tau'_g(\nu_R)\right] + (\nu - \nu_R)\left[\tau_e + \tau_{\rm at}\right]\right\} , \tag{12.56}$$


where $\tau'_g(\nu_R)$ is the expected delay of the reference feature, and $\tau_g(\nu_R)$ is the true
delay. The frequency-independent terms $\phi_{\rm in}$ and $2\pi n$ cancel in Eq. (12.56). However,
there are residual terms in Eq. (12.56) that are proportional to the difference in
frequency between the feature of interest and the reference feature. These terms arise
because phases at different frequencies are differenced in Eq. (12.55). Following the
notation of Eq. (12.7), which uses the convention $\Delta$ term = (assumed value) - (true
value), we can write Eq. (12.56) as


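$$\Delta^2\phi(\nu) = \frac{2\pi\nu}{c}\,\mathbf{D}\cdot\Delta\mathbf{s}_{\nu R} - \frac{2\pi\nu}{c}\,\Delta\mathbf{D}\cdot\Delta\mathbf{s}_{\nu R} - \frac{2\pi}{c}\left[(\nu - \nu_R)\left(\Delta\mathbf{D}\cdot\mathbf{s}_R + \mathbf{D}\cdot\Delta\mathbf{s}_R\right)\right] + 2\pi(\nu - \nu_R)(\tau_e + \tau_{\rm at}) , \tag{12.57}$$


where $\mathbf{D}$ is the assumed baseline, $\Delta\mathbf{D}$ is the baseline error, $\mathbf{s}_R$ is the assumed position
of the reference feature, and $\Delta\mathbf{s}_R$ is the corresponding position error. $\Delta\mathbf{s}_{\nu R}$ is the
separation vector from the feature at frequency $\nu$ to the reference feature, and thus
the true position of the feature at frequency $\nu$ is $\mathbf{s}_R - \Delta\mathbf{s}_R + \Delta\mathbf{s}_{\nu R}$.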

The first term on the right side of Eq. (12.57) is the desired quantity from which
the position of the feature relative to the reference feature can be determined, and
the remaining terms describe the phase errors introduced by uncertainty in baseline,
source position, clock offset, and atmospheric delay. These phase error terms can
be converted approximately to angular errors by multiplying them by $c/2\pi\nu D$. Thus,
for example, an error of 0.3 m in a baseline component would cause a delay error
of about 1 ns in the term $\Delta\mathbf{D}\cdot\mathbf{s}_R$ in Eq. (12.57) and a phase error of $10^{-3}$ turns for
features separated by 1 MHz. This phase error corresponds to a nominal error of
$10^{-6}$ arcsec on a baseline of 2500 km at 22 GHz, which provides a fringe spacing of
$10^{-3}$ arcsec. Similarly, a clock or atmospheric error of 1 ns would cause the same
positional error. The same baseline error also causes additional positional errors,
through the $\Delta\mathbf{D}\cdot\Delta\mathbf{s}_{\nu R}$ term, of $10^{-7}$ arcsec per arcsecond separation of the features.
A detailed discussion of mapping errors caused by this calibration method can be 
found in Genzel et al. (1981). 
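
The error budget quoted above can be reproduced numerically. The sketch below is only an illustration of the arithmetic, using the values given in the text (0.3 m baseline error, 1 MHz feature separation, 2500 km baseline, 22 GHz); the helper names are hypothetical.

```python
import math

C = 299_792_458.0                           # speed of light, m/s
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

def delay_error_s(baseline_error_m):
    """Delay error produced by a baseline-component error."""
    return baseline_error_m / C

def phase_error_turns(delay_s, freq_separation_hz):
    """Phase error (turns) from a delay error acting across a frequency offset."""
    return delay_s * freq_separation_hz

def fringe_spacing_arcsec(baseline_m, freq_hz):
    """Fringe spacing lambda / D, converted to arcseconds."""
    return (C / freq_hz) / baseline_m * RAD_TO_ARCSEC

tau_err = delay_error_s(0.3)                    # about 1 ns
turns = phase_error_turns(tau_err, 1.0e6)       # about 1e-3 turns
spacing = fringe_spacing_arcsec(2.5e6, 22.0e9)  # about 1e-3 arcsec
position_err = turns * spacing                  # about 1e-6 arcsec
cross_term = 0.3 / 2.5e6                        # about 1e-7 arcsec per arcsec of separation
```

The last line uses the fact that the $\Delta\mathbf{D}\cdot\Delta\mathbf{s}_{\nu R}$ term scales the feature separation by the fractional baseline error.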

Another method of calibrating the fringe phase is to scale the phase of the 
reference feature to the frequency of the feature to be calibrated. That is, 


$$\Delta^2\phi(\nu) = \Delta\phi(\nu) - \frac{\nu}{\nu_R}\,\Delta\phi(\nu_R) . \tag{12.58}$$


This method of calibration is more accurate than the method of Eq. (12.55) because
error terms proportional to $\nu - \nu_R$ do not appear. However, there are additional terms
involving the phase ambiguity and the instrumental phase. Thus, this calibration 
method is applicable only if the fringe phase can be followed carefully enough to 
avoid the introduction of phase ambiguities. 

Maps of lower accuracy and sensitivity than those obtainable from phase 
data can be made with fringe-frequency data. Suppose that the interferometer is 
well calibrated. The differential fringe frequency, that is, the difference in fringe 
frequency between the feature at frequency $\nu$ and the reference feature, can then be
written [using Eq. (12.20)]


$$\Delta^2\nu_f(\nu) = \dot{u}\,\Delta\alpha'(\nu) + \dot{v}\,\Delta\delta(\nu) , \tag{12.59}$$


where $\dot{u}$ and $\dot{v}$ are the time derivatives of the projected baseline components, $\Delta\alpha'(\nu)$
and $\Delta\delta(\nu)$ are the coordinate offsets from the reference feature, and $\Delta\alpha'(\nu) =
\Delta\alpha(\nu)\cos\delta$. The relative positions of the maser feature can then be found by fitting
Eq. (12.59) to a series of fringe-frequency measurements at various hour angles.
This technique was first employed by Moran et al. (1968) for the mapping of
an OH maser. The errors in fringe-frequency measurements decrease as $\tau^{-3/2}$ [see
Eq. (A12.27)], where $\tau$ is the length of an observation, but for large values of $\tau$,
the differential fringe frequency $\Delta^2\nu_f$ is not constant, because $\ddot{u}$ and $\ddot{v}$ are not zero.
Thus, there is a limited field of view available for accurate mapping with fringe-
frequency measurements. This field of view can be estimated by equating the rms
fringe-frequency error in Eq. (A12.27) with $\tau$ times the derivative of the differential
fringe frequency with respect to time. Therefore, for an east-west baseline,


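$$D_\lambda\,\omega_e^2\,\tau\,\Delta\theta\,\cos\theta \simeq \sqrt{\frac{3}{2\pi^2}}\left(\frac{T_S}{T_A}\right)\frac{1}{\sqrt{\Delta\nu\,\tau^3}} , \tag{12.60}$$


where $\Delta\theta$ is the field of view. For $\sqrt{2\pi^2/3}\,\cos\theta \simeq 1$, the field of view is


$$\Delta\theta \simeq \frac{T_S}{D_\lambda T_A\,\omega_e^2\,\tau^2\sqrt{\Delta\nu\,\tau}} , \tag{12.61}$$


or


$$\Delta\theta \simeq \frac{1}{R_{\rm sn} D_\lambda\,\omega_e^2\,\tau^2} , \tag{12.62}$$


where $R_{\rm sn}$ is the signal-to-noise ratio. Let $R_{\rm sn} = 10$ and $\tau = 100$ s. The field
of view is then about equal to 2000 times the fringe spacing. This restriction is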
often important. Usually when a feature is found, the phase center of the field is 
moved to the estimated position of the feature, and the position is then redetermined. 
Only components that are detected in individual observations on each baseline can 
be mapped with the fringe-frequency mapping technique. Thus, fringe-frequency 
mapping is less sensitive than synthesis mapping, in which fully coherent sensitivity 
is achieved. 
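
The figure of about 2000 fringe spacings follows directly from Eq. (12.62), since the fringe spacing is $1/D_\lambda$ radians. A minimal check, assuming the sidereal rotation rate of the Earth for $\omega_e$ (the function name is illustrative):

```python
OMEGA_E = 7.292115e-5   # rotation rate of the Earth, rad/s (assumed sidereal value)

def fov_in_fringe_spacings(snr, tau_s):
    """Eq. (12.62) multiplied by D_lambda: field of view in units of the fringe spacing."""
    return 1.0 / (snr * OMEGA_E**2 * tau_s**2)

n_fringes = fov_in_fringe_spacings(10.0, 100.0)   # R_sn = 10, tau = 100 s
```

Because the result scales as $\tau^{-2}$, doubling the integration time shrinks the usable field by a factor of four.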

The fringe-frequency analysis procedure can be extended to handle the case 
in which there are many point components in one frequency channel. From each 
observation (i.e., a measurement on one baseline lasting for a few minutes), 
the fringe-frequency spectrum is calculated. Multiple components will appear as 
distinct fringe-frequency features, as shown in Fig. 12.10. The fringe frequency of 
each feature defines a line in $(\Delta\alpha', \Delta\delta)$ space on which a maser component lies.
The slope of the line is $\tan^{-1}(\dot{u}/\dot{v})$. As the projected baseline changes, the slopes
of the lines change. The intersections of the lines define the source positions (see 
Fig. 12.10). For this method to work, the components must be sufficiently separate 
to produce separate peaks in the fringe-frequency spectrum. The fringe-frequency
resolution is about $\tau^{-1}$, which defines an effective beam of width


$$\Delta\theta_f = \frac{1}{D_\lambda\,\omega_e\,\tau\cos\theta} . \tag{12.63}$$


Fringe-frequency mapping is discussed in detail, for example, by Walker (1981). It
remains a useful technique for arrays that involve instruments such as RadioAstron.


Fig. 12.10 Plot (b) is the fringe-frequency spectrum of the water vapor maser in W49N, at one
particular hour angle and one frequency in the radio spectrum of the maser. The ordinate is flux
density. There are four peaks, each corresponding to a separate feature on the sky. Plot (a) shows
such lines from many scans. The peaks in the lower plot and their corresponding lines in the upper
plot are labeled A-D. There are at least four separate features at the frequency of these data. Their
positions are marked by the locations where many lines intersect. The feature corresponding to
line D is sufficiently far from the phase center that its fringe frequency changes enough during
the 20-min integration to degrade significantly the estimate of the feature position. The window
in which accurate positions can be determined is 0.5" in right ascension and 2" in declination.
The window can be moved by shifting the phase center of the data. From Walker (1981). © AAS.
Reproduced with permission.


Appendix 12.1 Least-Mean-Squares Analysis 


The principles of least-mean-squares analysis play a fundamental role in astrometry,
where the goal is to extract a number of parameters from a set of noisy
measurements. We briefly discuss these principles in an elementary way, ignoring
mathematical subtleties, and apply them to the problems encountered in
interferometry. Detailed discussions of the statistical analysis of data can be found in
books such as Bevington and Robinson (1992) and Hamilton (1964). The exhaustive 
treatment of how to fit a straight line, by Hogg et al. (2010), is highly recommended. 


A12.1.1 Linear Case


Suppose we wish to measure a quantity $m$. We make a set of measurements $y_i$ that
are the sum of the desired quantity $m$ and a noise contribution $n_i$:


$$y_i = m + n_i , \tag{A12.1}$$


where $n_i$ is a Gaussian random variable with zero mean and variance $\sigma_i^2$. The
probability that the $i$th measurement will take any specific value of $y_i$ is given by
the probability (density) function


$$p(y_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\,e^{-(y_i - m)^2/2\sigma_i^2} . \tag{A12.2}$$


If all the measurements are independent, then the probability that an experiment will
yield a set of $N$ measurements $y_1, y_2, \ldots, y_N$ is


$$L = \prod_{i=1}^{N} p(y_i) , \tag{A12.3}$$


where the $\prod$ denotes the product of the $p(y_i)$ terms. $L$, viewed as a function of $m$, is
called the likelihood function. The method of maximum likelihood is based on the
assumption that the best estimate of $m$ is the one that maximizes $L$. Maximizing $L$
is the same as maximizing $\ln L$, where


$$\ln L = \sum_{i=1}^{N}\ln\frac{1}{\sqrt{2\pi}\,\sigma_i} - \frac{1}{2}\sum_{i=1}^{N}\frac{(y_i - m)^2}{\sigma_i^2} . \tag{A12.4}$$


Since the first summation term on the right side of Eq. (A12.4) is a constant and the
second summation term is multiplied by $-\frac{1}{2}$, the maximization of $L$ is equivalent to
the minimization of the second summation term in Eq. (A12.4) with respect to $m$.
Thus, we wish to minimize the quantity $\chi^2$ given by


$$\chi^2 = \sum_{i=1}^{N}\frac{(y_i - m)^2}{\sigma_i^2} . \tag{A12.5}$$


In the more general problem discussed later in this appendix, m is replaced 
by a function with one or more parameters describing the system model. With 
this generalization, Eq. (A12.5) becomes the fundamental equation of the method
of weighted least-mean-squares. In this method, the parameters of the model
are determined by minimizing the sum of the squared differences between the
measurements and the model, weighted by the inverses of the variances of the
measurements. The quantity $\chi^2$, which indicates the goodness of fit, is a random variable whose mean
value equals the number of data points less the number of parameters when the 
model adequately describes the measurements. The method of least-mean-squares, 
appropriate when the noise is a Gaussian random process, is a special case of 
the more general method of maximum likelihood. Gauss invented the method of 
least-mean-squares, perhaps as early as 1795, using arguments similar to those given 
here, for the purpose of estimating the orbital parameters of planets and comets 
(Gauss 1809). The method was independently developed by Legendre in 1806 (Hall 
1970). 

Returning to Eq. (A12.5), we can estimate $m$ by setting the derivative of $\chi^2$ with
respect to $m$ equal to zero. The resulting estimate of $m$, denoted by $m_e$, is


$$m_e = \frac{\sum y_i/\sigma_i^2}{\sum 1/\sigma_i^2} , \tag{A12.6}$$


where the summation goes from $i = 1$ to $N$. Using Eq. (A12.2), we note that $\langle y_i\rangle =
m$ and $\langle y_i^2\rangle = m^2 + \sigma_i^2$. Therefore, by calculating the expectation of Eq. (A12.6), it
is clear that $\langle m_e\rangle = \langle y_i\rangle = m$, and it is easy to show that


$$\langle m_e^2\rangle = m^2 + \left(\sum\frac{1}{\sigma_i^2}\right)^{-1} . \tag{A12.7}$$


Hence the variance of the estimate $m_e$ is


$$\sigma_m^2 = \langle m_e^2\rangle - \langle m_e\rangle^2 = \left(\sum\frac{1}{\sigma_i^2}\right)^{-1} . \tag{A12.8}$$


Equation (A12.8) shows that when poor quality or noisy data are added to better
data, the value of $\sigma_m$ may be reduced only slightly. If the statistical error $\sigma_i$ of each
of the measurements has the same value, $\sigma$, then Eq. (A12.8) reduces to the
well-known result


$$\sigma_m = \frac{\sigma}{\sqrt{N}} , \tag{A12.9}$$


and $m_e$ is the average of the measurements. In many instances, $\sigma$ is not known. An
estimate of $\sigma^2$ is


$$\sigma_e^2 = \frac{1}{N}\sum (y_i - m)^2 . \tag{A12.10}$$


However, $m$ is not known, only its estimate, $m_e$. If $m_e$ were used in place of $m$
in Eq. (A12.10), the value of $\sigma_e^2$ would be an underestimate of $\sigma^2$ because of the
manner in which $m_e$ was determined in minimizing $\chi^2$. The unbiased estimate of $\sigma^2$
is


$$\sigma_e^2 = \frac{1}{N-1}\sum (y_i - m_e)^2 . \tag{A12.11}$$


It is easy to show by substitution of Eq. (A12.6) into Eq. (A12.11) that $\langle\sigma_e^2\rangle = \sigma^2$.
The term $N - 1$, which is called the number of degrees of freedom, appears in
Eq. (A12.11) because there are $N$ data points and one free parameter.
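
The estimators of Eqs. (A12.6), (A12.8), and (A12.11) are short enough to write out directly. The following sketch uses made-up data values purely for illustration; it also shows the remark above that adding one noisy point to good data barely reduces $\sigma_m$.

```python
import math

def weighted_mean(y, sigma):
    """Eq. (A12.6) for the estimate m_e and Eq. (A12.8) for its standard error."""
    w = [1.0 / s**2 for s in sigma]
    m_e = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    sigma_m = math.sqrt(1.0 / sum(w))
    return m_e, sigma_m

def unbiased_variance(y):
    """Eq. (A12.11): variance estimate with N - 1 degrees of freedom."""
    n = len(y)
    m_e = sum(y) / n
    return sum((yi - m_e) ** 2 for yi in y) / (n - 1)

y = [10.1, 9.9, 10.3, 9.7]                     # illustrative measurements
m_equal, s_equal = weighted_mean(y, [0.2] * 4) # equal errors: sigma_m = sigma / sqrt(N)
# Adding one point with ten times the error changes sigma_m only slightly.
m_mixed, s_mixed = weighted_mean(y + [12.0], [0.2] * 4 + [2.0])
var_est = unbiased_variance(y)
```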

Consider a model described by the function $f(x; p_1, \ldots, p_n)$, where $x$ is the
independent variable, which takes values $x_i$, where $i = 1$ to $N$, at the sample points,
and $p_1, \ldots, p_n$ are a set of parameters. We assume that the values of the independent
variable are exactly known. If the function $f$ correctly models the measurement
system, the measurement set is given by


$$y_i = f(x_i; p_1, \ldots, p_n) + n_i , \tag{A12.12}$$


where $n_i$ represents the measurement error. The general problem is to find the values
of the parameters for which $\chi^2$, given by the generalization of Eq. (A12.5),


$$\chi^2 = \sum\frac{\left[y_i - f(x_i)\right]^2}{\sigma_i^2} , \tag{A12.13}$$


is a minimum.


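A simple example of this problem is the fitting of a straight line to a data set. Let


$$f(x; a, b) = a + bx , \tag{A12.14}$$


where $a$ and $b$ are the parameters to be found. Minimizing $\chi^2$ is accomplished by
solving the equations


$$\frac{\partial\chi^2}{\partial a} = -\sum\frac{2(y_i - a - bx_i)}{\sigma_i^2} = 0 , \tag{A12.15a}$$


and


$$\frac{\partial\chi^2}{\partial b} = -\sum\frac{2(y_i - a - bx_i)\,x_i}{\sigma_i^2} = 0 . \tag{A12.15b}$$


In matrix notation, we have


$$\begin{bmatrix}\sum\dfrac{y_i}{\sigma_i^2}\\[2ex] \sum\dfrac{x_iy_i}{\sigma_i^2}\end{bmatrix} = \begin{bmatrix}\sum\dfrac{1}{\sigma_i^2} & \sum\dfrac{x_i}{\sigma_i^2}\\[2ex] \sum\dfrac{x_i}{\sigma_i^2} & \sum\dfrac{x_i^2}{\sigma_i^2}\end{bmatrix}\begin{bmatrix}a_e\\[2ex] b_e\end{bmatrix} , \tag{A12.16}$$


where we distinguish between the true values of the parameters and their estimates
by the subscript $e$. The solution is


$$a_e = \frac{1}{\Delta}\left[\sum\frac{x_i^2}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{x_iy_i}{\sigma_i^2}\right] \tag{A12.17}$$


and


$$b_e = \frac{1}{\Delta}\left[\sum\frac{1}{\sigma_i^2}\sum\frac{x_iy_i}{\sigma_i^2} - \sum\frac{x_i}{\sigma_i^2}\sum\frac{y_i}{\sigma_i^2}\right] , \tag{A12.18}$$


where $\Delta$ is the determinant of the square matrix in Eq. (A12.16), given by


$$\Delta = \sum\frac{1}{\sigma_i^2}\sum\frac{x_i^2}{\sigma_i^2} - \left(\sum\frac{x_i}{\sigma_i^2}\right)^2 . \tag{A12.19}$$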


Estimates of the errors in the parameters $a_e$ and $b_e$ can be calculated from
Eqs. (A12.17) and (A12.18) and are given by


$$\sigma_a^2 = \langle a_e^2\rangle - \langle a_e\rangle^2 = \frac{1}{\Delta}\sum\frac{x_i^2}{\sigma_i^2} \tag{A12.20}$$


and


$$\sigma_b^2 = \langle b_e^2\rangle - \langle b_e\rangle^2 = \frac{1}{\Delta}\sum\frac{1}{\sigma_i^2} . \tag{A12.21}$$


Note that $a_e$ and $b_e$ are random variables, and in general their covariance
$\langle a_eb_e\rangle - \langle a_e\rangle\langle b_e\rangle$ is not zero, so the parameter estimates are correlated. The error
estimates in Eqs. (A12.20) and (A12.21) include the deleterious effects of the
correlation between parameters. In this particular example, the correlation can be made
equal to zero by adjusting the origin of the $x$ axis so that $\sum(x_i/\sigma_i^2) = 0$.
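
As an illustration, the closed-form solution of Eqs. (A12.17)-(A12.21) can be coded directly. This is a sketch with noise-free, made-up data; the covariance term returned last is the off-diagonal element of the inverse of the matrix in Eq. (A12.16), which is not written out in the text.

```python
def fit_line(x, y, sigma):
    """Weighted straight-line fit following Eqs. (A12.17)-(A12.21)."""
    w = [1.0 / s**2 for s in sigma]
    s1 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s1 * sxx - sx * sx               # Eq. (A12.19)
    a_e = (sxx * sy - sx * sxy) / det      # Eq. (A12.17)
    b_e = (s1 * sxy - sx * sy) / det       # Eq. (A12.18)
    var_a = sxx / det                      # Eq. (A12.20)
    var_b = s1 / det                       # Eq. (A12.21)
    cov_ab = -sx / det                     # zero when sum(x_i / sigma_i^2) = 0
    return a_e, b_e, var_a, var_b, cov_ab

x = [-2.0, -1.0, 0.0, 1.0, 2.0]            # origin chosen so that sum(x_i) = 0
y = [1.0 + 0.5 * xi for xi in x]           # noise-free points on y = 1 + 0.5 x
a_e, b_e, var_a, var_b, cov_ab = fit_line(x, y, [0.1] * 5)
```

With the origin centered on the data, the covariance vanishes and the variances reduce to $\sigma^2/N$ and $\sigma^2/\sum x_i^2$.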

The above analysis can be used to estimate the accuracy of measurements of 
fringe frequency and delay made with an interferometer. Fringe frequency, the rate 
of change of fringe phase with time, 


$$\nu_f = \frac{1}{2\pi}\frac{d\phi}{dt} , \tag{A12.22}$$


can be estimated by fitting a straight line to a sequence of uniformly spaced
measurements of phase with respect to time. The fringe frequency is proportional
to the slope of this line. Assume that $N$ measurements of phase $\phi_i$, each having
the same rms error $\sigma_\phi$, are made at times $t_i$, spaced by interval $T$, running from
time $-NT/2$ to $NT/2$, such that the total time of the observation is $\tau = NT$.
From Eq. (A12.21) and the above definitions, including Eq. (A12.22), the error in
the fringe-frequency estimate is


$$\sigma_f^2 = \frac{\sigma_\phi^2}{(2\pi)^2\sum t_i^2} , \tag{A12.23}$$


since $\sum t_i = 0$. The term $\sum t_i^2$ is approximately given by


$$\sum t_i^2 \simeq \frac{1}{T}\int_{-\tau/2}^{\tau/2} t^2\,dt = \frac{\tau^3}{12\,T} = \frac{N\tau^2}{12} . \tag{A12.24}$$


$\tau/\sqrt{12}$ can be thought of as the rms time span of the data. Thus, Eq. (A12.23)
becomes


$$\sigma_f^2 = \frac{12\,\sigma_\phi^2}{(2\pi)^2 N\tau^2} . \tag{A12.25}$$


The expression for $\sigma_\phi$, given in Eq. (6.64) for the case when the source is unresolved
and there are no processing losses, is


$$\sigma_\phi = \frac{T_S}{T_A\sqrt{2\Delta\nu\,T}} , \tag{A12.26}$$


where $T_S$ is the system temperature, $T_A$ is the antenna temperature due to the source,
and $\Delta\nu$ is the bandwidth. Substitution of Eq. (A12.26) into Eq. (A12.25) yields


$$\sigma_f = \sqrt{\frac{3}{2\pi^2}}\left(\frac{T_S}{T_A}\right)\frac{1}{\sqrt{\Delta\nu\,\tau^3}}\quad\text{(Hz)} . \tag{A12.27}$$


Note that this result does not depend on the details of the analysis procedure, such
as the choice of $N$. Equivalently, one can estimate the fringe frequency by finding
the peak of the fringe-frequency spectrum, that is, the peak of the Fourier transform
of $e^{j\phi_i}$.
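
The statement that Eq. (A12.27) does not depend on $N$ can be verified numerically by evaluating Eq. (A12.25) with the per-sample phase error of Eq. (A12.26). In this sketch the ratio $T_S/T_A = 10$, the 74 kHz bandwidth, and the 100 s observation are illustrative values only.

```python
import math

def sigma_phi(ts_over_ta, bandwidth_hz, t_int_s):
    """Eq. (A12.26): rms phase error (rad) for integration time t_int_s."""
    return ts_over_ta / math.sqrt(2.0 * bandwidth_hz * t_int_s)

def sigma_f_from_samples(ts_over_ta, bandwidth_hz, tau_s, n):
    """Eq. (A12.25) with N phase samples, each integrated for T = tau / N."""
    s_phi = sigma_phi(ts_over_ta, bandwidth_hz, tau_s / n)
    return math.sqrt(12.0 * s_phi**2 / ((2.0 * math.pi) ** 2 * n * tau_s**2))

def sigma_f_closed_form(ts_over_ta, bandwidth_hz, tau_s):
    """Eq. (A12.27): rms fringe-frequency error in Hz."""
    return (math.sqrt(3.0 / (2.0 * math.pi**2)) * ts_over_ta
            / math.sqrt(bandwidth_hz * tau_s**3))

ref = sigma_f_closed_form(10.0, 74.0e3, 100.0)
ratios = [sigma_f_from_samples(10.0, 74.0e3, 100.0, n) / ref for n in (10, 100, 1000)]
```

Each entry of `ratios` equals unity to rounding error, whatever the number of samples.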

The delay is the rate of change of phase with frequency, 


$$\tau = \frac{1}{2\pi}\frac{d\phi}{d\nu} . \tag{A12.28}$$


Thus, the delay can be estimated by finding the slope of a straight line fitted to
a sequence of phase measurements as a function of frequency. For a single band,
such data can be obtained from the cross power spectrum, the Fourier transform
of the cross-correlation function. Assume that $N$ measurements of phase are made
at frequencies $\nu_i$, each with a bandwidth $\Delta\nu/N$ and with an error $\sigma_\phi$. In this
calculation, only the relative frequencies are important. It is convenient for the
purpose of analysis to set the zero of the frequency axis such that $\sum\nu_i = 0$. The
error in delay [from Eqs. (A12.19), (A12.21), and (A12.28)] is


$$\sigma_\tau^2 = \frac{\sigma_\phi^2}{(2\pi)^2\sum\nu_i^2} . \tag{A12.29}$$


Using a calculation for $\sum\nu_i^2$ analogous to the one in Eq. (A12.24), we can write
Eq. (A12.29) as


$$\sigma_\tau^2 = \frac{12\,\sigma_\phi^2}{(2\pi)^2 N\,\Delta\nu^2} . \tag{A12.30}$$


Thus, substitution of Eq. (A12.26) (with an integration time of $\tau$ and bandwidth
$\Delta\nu/N$) into Eq. (A12.30) yields


$$\sigma_\tau = \sqrt{\frac{3}{2\pi^2}}\left(\frac{T_S}{T_A}\right)\frac{1}{\sqrt{\Delta\nu^3\,\tau}} . \tag{A12.31}$$


We can define the rms bandwidth as


$$\Delta\nu_{\rm rms} = \sqrt{\frac{1}{N}\sum\nu_i^2} , \tag{A12.32}$$


and obtain from Eqs. (A12.26) and (A12.29) the result quoted in Sect. 9.8
[Eq. (9.179)],


$$\sigma_\tau = \frac{1}{\zeta}\left(\frac{T_S}{T_A}\right)\frac{1}{\sqrt{\Delta\nu_{\rm rms}^3\,\tau}} , \tag{A12.33}$$


where $\zeta = \pi(768)^{1/4}$. (Note that in Sect. 9.8, $\sigma_\phi$ applies to the full bandwidth $\Delta\nu$.)
The expressions for $\sigma_\tau$ in Eqs. (A12.30), (A12.31), and (A12.33) incorporate the
condition $\Delta\nu_{\rm rms} = \Delta\nu/\sqrt{12}$ and apply to a continuous passband of width $\Delta\nu$.

In bandwidth synthesis, which is described in Sect. 9.8, the measurement system
consists of $N$ channels of width $\Delta\nu/N$, which are not in general contiguous.
The rms delay error is obtained by substituting Eqs. (A12.26) and (A12.32) into
Eq. (A12.29), yielding


$$\sigma_\tau = \frac{1}{\sqrt{8\pi^2}}\left(\frac{T_S}{T_A}\right)\frac{1}{\sqrt{\Delta\nu\,\tau}\;\Delta\nu_{\rm rms}} , \tag{A12.34}$$


where $\Delta\nu_{\rm rms}$ is given by Eq. (A12.32) and $\Delta\nu$ is the total bandwidth. $\Delta\nu_{\rm rms}$ is
generally equal to about 40% of the total frequency range spanned.
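
The three expressions for the delay error should agree for a contiguous passband, for which $\Delta\nu_{\rm rms} = \Delta\nu/\sqrt{12}$. The sketch below checks this, using an illustrative 2 MHz bandwidth, 100 s integration, and $T_S/T_A = 10$; none of these values comes from the text.

```python
import math

ZETA = math.pi * 768.0 ** 0.25   # constant in Eq. (A12.33)

def sigma_tau_contiguous(ts_over_ta, bandwidth_hz, tau_s):
    """Eq. (A12.31): continuous passband of width bandwidth_hz."""
    return (math.sqrt(3.0 / (2.0 * math.pi**2)) * ts_over_ta
            / math.sqrt(bandwidth_hz**3 * tau_s))

def sigma_tau_synthesis(ts_over_ta, bandwidth_hz, rms_bandwidth_hz, tau_s):
    """Eq. (A12.34): total bandwidth bandwidth_hz with rms spread rms_bandwidth_hz."""
    return ts_over_ta / (math.sqrt(8.0 * math.pi**2)
                         * math.sqrt(bandwidth_hz * tau_s) * rms_bandwidth_hz)

def sigma_tau_zeta(ts_over_ta, rms_bandwidth_hz, tau_s):
    """Eq. (A12.33): form quoted in terms of the rms bandwidth alone."""
    return ts_over_ta / (ZETA * math.sqrt(rms_bandwidth_hz**3 * tau_s))

bw, t_obs, ratio = 2.0e6, 100.0, 10.0
rms_bw = bw / math.sqrt(12.0)
ref_tau = sigma_tau_contiguous(ratio, bw, t_obs)
via_synthesis = sigma_tau_synthesis(ratio, bw, rms_bw, t_obs)
via_zeta = sigma_tau_zeta(ratio, rms_bw, t_obs)
```

For non-contiguous channels only Eq. (A12.34) applies, and spreading the channels to increase $\Delta\nu_{\rm rms}$ reduces the delay error in direct proportion.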

A general formulation of the linear least-mean-squares solution can be found
when the model function $f$ is a linear function of the parameters $p_k$, that is, when


$$f(x; p_1, \ldots, p_n) = \sum_{k=1}^{n}\frac{\partial f}{\partial p_k}\,p_k , \tag{A12.35}$$


where $n$ is the number of parameters. For example, the model could be a cubic
polynomial


$$f(x; p_0, p_1, p_2, p_3) = p_0 + p_1x + p_2x^2 + p_3x^3 , \tag{A12.36}$$


in which case $\partial f/\partial p_k = x^k$ for $k = 0, 1, 2,$ and $3$. If the parameters appear as
linear multiplicative factors, then the minimization of Eq. (A12.13) leads to a set of
$n$ equations of the form


$$\frac{\partial\chi^2}{\partial p_k} = 0 , \qquad k = 1, 2, \ldots, n . \tag{A12.37}$$


Substitution of Eq. (A12.13) into Eq. (A12.37) and use of Eq. (A12.35) yield the set
of $n$ equations


$$D_k = \sum_{j=1}^{n} T_{jk}\,p_j , \qquad k = 1, 2, \ldots, n , \tag{A12.38}$$


where


$$D_k = \sum_{i=1}^{N}\frac{y_i}{\sigma_i^2}\frac{\partial f(x_i)}{\partial p_k} \tag{A12.39}$$


and


$$T_{jk} = \sum_{i=1}^{N}\frac{1}{\sigma_i^2}\frac{\partial f(x_i)}{\partial p_j}\frac{\partial f(x_i)}{\partial p_k} , \tag{A12.40}$$


and the summations are carried out over the set of $N$ independent measurements. In
matrix notation, the equation set (A12.38) is


$$[D] = [T][P_e] , \tag{A12.41}$$


where $[D]$ is a column matrix with elements $D_k$, $[P_e]$ is a column matrix containing
the estimates of the parameters $p_{ek}$, and $[T]$ is a symmetric square matrix with
elements $T_{jk}$. For obvious reasons, $[T]$ is sometimes called the matrix of the normal
equations. Note that Eq. (A12.41) is a generalization of Eq. (A12.16). The matrices
$[T]$ and $[D]$ are sometimes written as the product of other matrices (Hamilton 1964,
Ch. 4). Let $[M]$ be the variance matrix (size $N \times N$) whose diagonal elements are
$\sigma_i^2$ and whose off-diagonal elements are zero; let $[F]$ be a column matrix containing
the data $y_i$; and let $[A]$ be the partial derivative matrix ($N$ rows by $n$ columns) whose
elements are $\partial f(x_i)/\partial p_k$. Then one can write $[T] = [A]^T[M]^{-1}[A]$ and $[D] = [A]^T[M]^{-1}[F]$,
where $[A]^T$ is the transpose of $[A]$ and $[M]^{-1}$ is the inverse of $[M]$. The analysis can
be generalized to include the situation in which the errors between measurements
are correlated. In this case, $[M]$ is modified to include off-diagonal elements $\sigma_i\sigma_j\rho_{ij}$,
where $\rho_{ij}$ is the correlation coefficient for the $i$th and $j$th measurements.
The solution to Eq. (A12.41) is


$$[P_e] = [T]^{-1}[D] , \tag{A12.42}$$


where $[T]^{-1}$ is the inverse matrix of $[T]$, and $[P_e]$ is the column matrix containing
the parameter estimates. The elements of $[T]^{-1}$ are denoted $T'_{jk}$. It can be shown
by direct calculation that the estimates of the errors of the parameters $\sigma_{ek}^2$ are the
diagonal elements of $[T]^{-1}$, which is called the covariance matrix. Thus,


$$\sigma_{ek}^2 = T'_{kk} . \tag{A12.43}$$


The probability that parameter $p_k$ will be within $\pm\sigma_{ek}$ of its true value is 0.68, which
is the integral under the one-dimensional Gaussian probability distribution between
$\pm\sigma$. The probability that all of the $n$ parameters will be within $\pm\sigma$ of their true
values (i.e., within the error "box" in the $n$-dimensional space) is approximately
$0.68^n$ when the correlations are moderate.


Fig. A12.1 The error ellipse, or contour, defining the $e^{-1/2}$ level of the joint probability
function [Eq. (A12.45)] for the estimates of parameters $p_k$ and $p_j$. The quantities $p_{ek} - p_k$
and $p_{ej} - p_j$ are the parameter estimates minus their true values. The angle $\psi_{jk}$ is
defined by Eq. (A12.46).
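
The general linear solution of Eqs. (A12.38)-(A12.43) can be sketched for the cubic polynomial of Eq. (A12.36). The code below is a minimal illustration in pure Python, using a small Gauss-Jordan inverse that is adequate only for well-conditioned matrices of modest size; the data are noise-free and made up, and the function names are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def linear_least_squares(x, y, sigma, basis):
    """Solve [D] = [T][P_e]; basis[k](x) plays the role of df/dp_k."""
    n = len(basis)
    derivs = [[g(xi) for g in basis] for xi in x]       # N rows, n columns
    w = [1.0 / s**2 for s in sigma]
    d = [sum(wi * yi * row[k] for wi, yi, row in zip(w, y, derivs))
         for k in range(n)]                              # Eq. (A12.39)
    t = [[sum(wi * row[j] * row[k] for wi, row in zip(w, derivs))
          for k in range(n)] for j in range(n)]          # Eq. (A12.40)
    t_inv = invert(t)                                    # covariance matrix
    p_e = [sum(t_inv[k][j] * d[j] for j in range(n)) for k in range(n)]  # Eq. (A12.42)
    errors = [t_inv[k][k] ** 0.5 for k in range(n)]      # Eq. (A12.43)
    return p_e, errors, t_inv

cubic = [lambda x: 1.0, lambda x: x, lambda x: x**2, lambda x: x**3]
xs = [i / 4.0 - 1.0 for i in range(9)]                   # -1, -0.75, ..., 1
ys = [2.0 - xi + 0.5 * xi**3 for xi in xs]               # noise-free cubic
p_e, errors, cov = linear_least_squares(xs, ys, [0.05] * 9, cubic)
```

Note that `errors` depends only on the sample points and the assumed measurement errors, not on `ys`, which is the property that allows parameter errors to be predicted before an experiment is carried out.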


The normalized correlation coefficients between parameters are proportional to
the off-diagonal elements of $[T]^{-1}$:


$$\rho_{jk} = \frac{\langle (p_{ek} - p_k)(p_{ej} - p_j)\rangle}{\sigma_{ek}\,\sigma_{ej}} = \frac{T'_{jk}}{\sqrt{T'_{jj}\,T'_{kk}}} . \tag{A12.44}$$


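For any two parameters, there is a bivariate Gaussian probability distribution that
describes the distribution of errors,


$$p(\epsilon_k, \epsilon_j) = \frac{1}{2\pi\sigma_k\sigma_j\sqrt{1 - \rho_{jk}^2}}\exp\left\{-\frac{1}{2(1 - \rho_{jk}^2)}\left[\frac{\epsilon_k^2}{\sigma_k^2} + \frac{\epsilon_j^2}{\sigma_j^2} - \frac{2\rho_{jk}\,\epsilon_k\epsilon_j}{\sigma_k\sigma_j}\right]\right\} , \tag{A12.45}$$


where $\epsilon_k = p_{ek} - p_k$ and $\epsilon_j = p_{ej} - p_j$. The contour of $p(\epsilon_k, \epsilon_j) = p(0, 0)\,e^{-1/2}$
defines an ellipse, shown in Fig. A12.1, which is known as the error ellipse. The
probability that both parameters will lie within the error ellipse is the integral of
Eq. (A12.45) over the area of the error ellipse, which equals 0.46. The orientation
of the error ellipse is given by


$$\psi_{jk} = \frac{1}{2}\tan^{-1}\left(\frac{2\rho_{jk}\,\sigma_j\sigma_k}{\sigma_j^2 - \sigma_k^2}\right) . \tag{A12.46}$$
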
The errors in the parameters $p_k$ are completely determined by the matrix $[T]^{-1}$
through Eqs. (A12.43)-(A12.45). The elements of $[T]^{-1}$ depend only on the partial
derivatives of the model function and the values of the measurement errors, which
can usually be predicted in advance from the characteristics of the measurement
apparatus. Therefore, once an experiment is planned, the errors in the parameters
can be predicted from $[T]^{-1}$ without reference to the data. For this reason, $[T]$
is sometimes called the design matrix. Studies of the design matrix for a specific
experiment might reveal a very high correlation between two parameters, leading to
large errors in their estimated values. It is often possible to modify the experiment
to obtain more data that will reduce the correlation. After the data are analyzed,
the value of $\chi^2$ can be computed. If the model is a good fit to the data, $\chi^2$ should
be approximately equal to $N - n$, the number of measurements minus the number
of parameters. If it is not, the difficulty is often that the values of $\sigma_i$ are estimated
incorrectly or that the model does not describe adequately the measurement system,
that is, the model has too few parameters or is not correct. Even if $\chi^2 \simeq N - n$,
the derived errors in Eq. (A12.43) may not be realistic, and they are referred
to as "formal errors." The formal errors describe the precision of the parameter
estimates. The accuracy of the parameter measurements is the deviation between
the estimates of the parameters and the true values of the parameters. The accuracy 
of the measurements is often difficult to determine. For example, an unknown effect 
that closely mimics the functional dependence of one of the model parameters may 
be present in an experiment. The model may appear to be a good one, but the 
accuracy of the particular model parameter in question will be much poorer than 
expected because of the systematic error introduced by the unmodeled effect. 

We can envision how the principles of least-mean-squares analysis are applied 
to a large astrometric experiment. Consider a hypothetical VLBI experiment made 
on a three-station array. Suppose that ten recordings are made of each of 20 
sources during observations made over one day (an epoch). The observations are 
repeated six times a year for five years. The data set would consist of 18,000 
measurements (20 sources x 10 observations x 3 baselines x 30 epochs) of delay 
and fringe frequency, or 36,000 total measurements. The measurements of delay and 
fringe frequency can be combined in the analysis since, in the least-mean-squares 
method, the relevant quantities are the squares of the measurements divided by 
their variances, which are dimensionless, as in Eq. (A12.13). Now we can count 
the number of parameters in the analysis model: 39 source coordinates (1 right 
ascension fixed), 9 station coordinates, 90 atmospheric parameters (a zenith excess 
path length at each station at each epoch), 120 clock parameters (a clock error and 
clock rate error at two of the stations per epoch), and 90 polar motion and UT1—UTC 
parameters, as well as several other parameters to model precession, nutation, solid- 
Earth tides, gravitational deflection by the Sun, movement of stations, and other 
effects such as antenna axis offsets (see Sect. 4.6.1). The total number of parameters 
is about 360. The parameters within each observation epoch are linked because 
of the common clock and atmosphere parameters. Parameters among epochs are 
linked because of baseline, precession, and nutation parameters. Naturally, partial 
solutions from subsets of the data should be obtained before a grand global solution 
is attempted. Procedures are available for obtaining global solutions that do not 
require the inversion of matrices as large as the total number of parameters [see, 
e.g., Morrison (1969)]. Experiments of the scale described here, and larger ones, 
have been carried out [e.g., Fanselow et al. (1984), Herring et al. (1985), and Ma 
et al. (1998)].


A12.1.2 Nonlinear Case


The discussion of linear least-mean-squares analysis can be generalized to include
nonlinear functions in a straightforward manner. Assume that $f(x; p)$ has one
nonlinear parameter $p_n$. For the purpose of discussion, we can separate $f$ into
linear and nonlinear parts, $f_L(x; p_1, \ldots, p_{n-1})$ and $f_{NL}(x; p_n)$, and approximate the
nonlinear function by the first two terms in a Taylor expansion


$$f_{NL}(x; p_n) \simeq f_{NL}(x; p_{0n}) + \frac{\partial f_{NL}}{\partial p_n}\,\Delta p_n , \tag{A12.47}$$


where $p_{0n}$ is the initial guess of parameter $p_n$ and $\Delta p_n = p_n - p_{0n}$. We assume that
the initial parameter guesses are accurate enough for Eq. (A12.47) to be valid. We
replace the data with $y_i - f_{NL}(x_i; p_{0n})$ and then compute the elements of the matrices
$[D]$ and $[T]$ from the partial derivatives, including $\partial f_{NL}/\partial p_n$. The $n$th parameter in
the matrix $[P_e]$ in Eq. (A12.42) will be the differential parameter $\Delta p_n$ defined in
Eq. (A12.47). The solution must be iterated with a new Taylor expansion centered
on the parameter $p_{0n} + \Delta p_n$. Thus, nonlinear functions can be accommodated in the
analysis through linearization, but initial guesses of the nonlinear parameters and 
solution iteration are required. In some cases, nonlinear estimation problems can 
cause difficulties [see, e.g., Lampton et al. (1976), Press et al. (1992)]. Recently, 
the use of the Markov chain Monte Carlo (MCMC) method has become almost 
universal (Sivia and Skilling 2006). 
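
The linearize-and-iterate procedure described above can be illustrated with a toy model having one linear and one nonlinear parameter, $y = a + e^{-kx}$. This model, the data, and the starting guess are assumptions chosen for the example and do not appear in the text; the sketch follows the steps given above (subtract $f_{NL}$ at the current guess, solve the normal equations for $a$ and $\Delta k$, update, and repeat).

```python
import math

def fit_offset_plus_decay(x, y, sigma, k0, iterations=20):
    """Iterated linearized fit of y = a + exp(-k x); a is linear, k is nonlinear."""
    k = k0
    a = 0.0
    w = [1.0 / s**2 for s in sigma]
    for _ in range(iterations):
        # Data minus the nonlinear part evaluated at the current guess of k.
        resid = [yi - math.exp(-k * xi) for xi, yi in zip(x, y)]
        dk = [-xi * math.exp(-k * xi) for xi in x]      # d f_NL / d k at the guess
        t11 = sum(w)                                    # d f / d a = 1
        t12 = sum(wi * v for wi, v in zip(w, dk))
        t22 = sum(wi * v * v for wi, v in zip(w, dk))
        d1 = sum(wi * r for wi, r in zip(w, resid))
        d2 = sum(wi * r * v for wi, r, v in zip(w, resid, dk))
        det = t11 * t22 - t12 * t12
        a = (t22 * d1 - t12 * d2) / det                 # linear parameter estimate
        delta_k = (t11 * d2 - t12 * d1) / det           # differential parameter
        k += delta_k                                    # recenter the expansion
    return a, k

xs = [0.5 * i for i in range(7)]
ys = [2.0 + math.exp(-0.8 * xi) for xi in xs]           # noise-free data, a = 2, k = 0.8
a_fit, k_fit = fit_offset_plus_decay(xs, ys, [0.01] * 7, k0=0.5)
```

A poor initial guess can make this kind of iteration diverge, which is one reason sampling methods such as MCMC are attractive for strongly nonlinear problems.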


A12.1.3 (u, v) vs. Image Plane Fitting


One final topic concerns the estimation of the coordinates of a radio source 
with a well-calibrated interferometer, which has accurately known baselines and 
instrumental phases. In this case, the differential interferometer phase is, from 
Eq. (12.2), 


$$\Delta\phi = 2\pi D_\lambda\left\{\left[\sin d\cos\delta - \cos d\sin\delta\cos(H - h)\right]\Delta\delta + \cos d\cos\delta\sin(H - h)\,\Delta\alpha\right\} . \tag{A12.48}$$


Expressing the geometric quantities in terms of projected baseline components, we
can write Eq. (A12.48) as


$$\Delta\phi = 2\pi\left(u\,\Delta\alpha' + v\,\Delta\delta\right) , \tag{A12.49}$$


where $\Delta\alpha' = \Delta\alpha\cos\delta$. A set of phase measurements from one or more baselines
can be analyzed by the method of least-mean-squares to determine $\Delta\alpha'$ and $\Delta\delta$.
The partial derivatives are $\partial f/\partial p_1 = 2\pi u$ and $\partial f/\partial p_2 = 2\pi v$, where $p_1 = \Delta\alpha'$ and
$p_2 = \Delta\delta$. From Eqs. (A12.40) and (A12.49), the normal-equation matrix is


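$$[T] = \frac{4\pi^2}{\sigma_\phi^2}\begin{bmatrix}\sum u_i^2 & \sum u_iv_i\\[1ex] \sum u_iv_i & \sum v_i^2\end{bmatrix} , \tag{A12.50}$$


where all the measurements are assumed to have the same uncertainty $\sigma_\phi$ given by
Eq. (A12.26). The inverse of $[T]$ is


$$[T]^{-1} = \frac{4\pi^2}{\sigma_\phi^2\,\Delta}\begin{bmatrix}\sum v_i^2 & -\sum u_iv_i\\[1ex] -\sum u_iv_i & \sum u_i^2\end{bmatrix} , \tag{A12.51}$$


where $\Delta$ is the determinant of the matrix in Eq. (A12.50),


$$\Delta = \left(\frac{4\pi^2}{\sigma_\phi^2}\right)^2\left[\sum u_i^2\sum v_i^2 - \left(\sum u_iv_i\right)^2\right] . \tag{A12.52}$$


The correlation coefficient defined by Eq. (A12.44) is


$$\rho_{12} = \frac{-\sum u_iv_i}{\sqrt{\sum u_i^2\sum v_i^2}} . \tag{A12.53}$$


The variances of the estimates of the parameters are given by the diagonal elements
of Eq. (A12.51),


$$\sigma_{\alpha'}^2 = \frac{\sigma_\phi^2\sum v_i^2}{4\pi^2\left[\sum u_i^2\sum v_i^2 - \left(\sum u_iv_i\right)^2\right]} \tag{A12.54}$$


and


$$\sigma_\delta^2 = \frac{\sigma_\phi^2\sum u_i^2}{4\pi^2\left[\sum u_i^2\sum v_i^2 - \left(\sum u_iv_i\right)^2\right]} . \tag{A12.55}$$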


If the $(u, v)$ loci are long (that is, the observations extend over a large fraction of the
day), then $\sum u_iv_i$ will be small compared to $\sum u_i^2$ and $\sum v_i^2$ so that


$$\sigma_{\alpha'} \simeq \frac{\sigma_\phi}{2\pi\sqrt{\sum u_i^2}} \tag{A12.56}$$


and


$$\sigma_\delta \simeq \frac{\sigma_\phi}{2\pi\sqrt{\sum v_i^2}} . \tag{A12.57}$$


Furthermore, if only one baseline is used on a high-declination source, then $u_i \sim
v_i \sim D_\lambda$, and both errors reduce to the intuitive result


$$\sigma_{\alpha'} \simeq \sigma_\delta \simeq \frac{\sigma_\phi}{2\pi\sqrt{N}\,D_\lambda} . \tag{A12.58}$$


Alternatively, the source position can be found by Fourier transformation of the
visibility data. This procedure can be thought of as image plane fitting or as
multiplying the visibility data by the exponential factors $\exp[j2\pi(u_i\Delta\alpha' + v_i\Delta\delta)]$ and
summing over the data. The resulting "function" is maximized with respect to $\Delta\alpha'$
and $\Delta\delta$. In this latter view, it is easy to understand that (basic) image plane fitting
(that is, no tapering or gridding of the data) is a maximum-likelihood procedure
for finding the position of a point source and therefore formally equivalent to the
method of least-mean-squares. The synthesized beam $b_0$ for $N$ measurements is


$$b_0(\Delta\alpha', \Delta\delta) = \frac{1}{N}\sum\cos\left[2\pi\left(u_i\Delta\alpha' + v_i\Delta\delta\right)\right] . \tag{A12.59}$$


The shape of $b_0$ near its peak can be found by expanding Eq. (A12.59) to second
order:


$$b_0(\Delta\alpha', \Delta\delta) \simeq 1 - \frac{2\pi^2}{N}\left(\Delta\alpha'^2\sum u_i^2 + \Delta\delta^2\sum v_i^2 + 2\,\Delta\alpha'\Delta\delta\sum u_iv_i\right) . \tag{A12.60}$$


From Eq. (A12.60), it is easy to see that the contours of the synthesized beam are
proportional to the error ellipse defined by Eqs. (A12.45), (A12.46), and (A12.53)–(A12.55).
Note that the method of least-mean-squares can be applied only in the
regime of high signal-to-noise ratio, where phase ambiguities can be resolved. 
However, the Fourier synthesis method can be applied in any case. 
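
As a numerical illustration of Eqs. (A12.53)–(A12.58), the following Python sketch evaluates the position uncertainties and the correlation coefficient for a set of (u, v) samples. The function name `position_errors`, the circular (u, v) track, and the 10° phase uncertainty are assumptions chosen here for illustration; they are not taken from the text.

```python
import math

def position_errors(u, v, sigma_phi):
    """Return (sigma_alpha, sigma_delta, rho_12) from Eqs. (A12.53)-(A12.55).

    u, v are in wavelengths and sigma_phi is in radians, so the returned
    uncertainties are in radians.
    """
    suu = sum(x * x for x in u)
    svv = sum(y * y for y in v)
    suv = sum(x * y for x, y in zip(u, v))
    bracket = suu * svv - suv ** 2
    var_alpha = sigma_phi ** 2 * svv / (4.0 * math.pi ** 2 * bracket)
    var_delta = sigma_phi ** 2 * suu / (4.0 * math.pi ** 2 * bracket)
    rho = -suv / math.sqrt(suu * svv)
    return math.sqrt(var_alpha), math.sqrt(var_delta), rho

# Hypothetical long track: a full circle in the (u, v) plane, sampled n times.
n, d_lambda = 360, 1.0e6
u = [d_lambda * math.cos(2.0 * math.pi * k / n) for k in range(n)]
v = [d_lambda * math.sin(2.0 * math.pi * k / n) for k in range(n)]
sigma_phi = math.radians(10.0)

sigma_alpha, sigma_delta, rho = position_errors(u, v, sigma_phi)

# Long-track approximation of Eq. (A12.56) for comparison.
approx_alpha = sigma_phi / (2.0 * math.pi * math.sqrt(sum(x * x for x in u)))
```

For this symmetric track the cross term vanishes, so the full expression and the approximation of Eq. (A12.56) should agree closely.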


Appendix 12.2 Second-Order Effects in Phase Referencing 


We present a more general analysis of how an error in the position of a calibrator 
source affects the determination of the position of a target source. Suppose the 
phase of an interferometer is referenced to its tracking center (corresponding to 6, 
in Eq. (12.33)). If the calibrator has coordinate errors $x_c = \Delta\alpha_c \cos\delta_c$ and $y_c = \Delta\delta_c$,
then the residual phase is

$$ \Delta\phi_{c1} = 2\pi (u_c x_c + v_c y_c) . \tag{A12.61} $$

This causes a shift in phase at the position of the target of

$$ \Delta\phi_{c2} = 2\pi (u_t x_c + v_t y_c) . \tag{A12.62} $$

Since the (u, v) coordinates are slightly different, there will be a second-order phase
shift of $\Delta^2\phi = \Delta\phi_{c2} - \Delta\phi_{c1}$:

$$ \Delta^2\phi = 2\pi [(u_t - u_c) x_c + (v_t - v_c) y_c] = 2\pi (\Delta u\, x_c + \Delta v\, y_c) . \tag{A12.63} $$


This leads to the same approximation given in Eq. (12.39). A complete expression 
for Eq. (A12.63) can be derived by calculating the differential quantities Au and Av 
from Eq. (4.3). 
Chapter 13
Propagation Effects: Neutral Medium 


The neutral gas in the atmosphere has a significant effect on signals passing through 
it. We are concerned with three types of effects. First, the large-scale structures in the 
media give rise to refractive effects. These effects, which can be analyzed in terms of 
geometrical optics and Fermat’s principle, are the deflection of the radio waves and 
the change of the propagation velocity. Second, radiation can be absorbed. Finally, 
radiation can be scattered by the turbulent structure of the media. The phenomenon 
of scattering results in scintillation, or seeing. 

In the troposphere, water vapor plays a particularly important role in radio 
propagation. The refractivity of water vapor is about 20 times greater in the 
radio range than in the near-infrared or optical regimes. The phase fluctuations 
in radio interferometers at centimeter, millimeter, and submillimeter wavelengths 
are caused predominantly by fluctuations in the distribution of water vapor. Water 
vapor is poorly mixed in the troposphere, and the total column density of water 
vapor cannot be accurately sensed from surface meteorological measurements. 
Uncertainties in the water vapor content are a serious limitation to the accuracy of 
VLBI measurements. Small-scale (< 1 km) fluctuations in water vapor distribution 
limit the angular resolution of connected-element interferometers in the absence 
of wavefront correction techniques. Furthermore, spectral lines of water vapor 
cause substantial absorption at frequencies above 100 GHz and usually render the 
troposphere highly opaque at frequencies between 1 and 10 THz (300 and 30 μm).
Thus, any discussion of the neutral atmosphere must be primarily concerned with 
the effects of water vapor. Propagation in the neutral atmosphere from the point of 
view of radio communications is discussed by Crane (1981) and Bohlander et al. 
(1985). 

Our interest in the propagation media arises because the media degrade interferometric
measurements of radio sources. Alternatively, observations of radio sources
can be used to probe the characteristics of the propagation media. Radio interferometric
measurements have been used widely for this purpose.
Fig. 13.1 Vertical profiles of temperature (solid line) and the water vapor (H2O)
(dashed line) and ozone (O3) (dotted line) volume-mixing ratios, averaged over northern
and southern midlatitudes for the period 2005-2014, compiled from the NASA
Program for Modern-Era Retrospective Analysis for Research and Application
(MERRA) reanalysis (Rienecker et al. 2011). The averaging captures diurnal
and annual variations. Axis label: H2O and O3 volume mixing ratio (ppm).
13.1 Theory


A temperature profile of the atmosphere is shown in Fig. 13.1. In the lowest part 
of the atmosphere, the temperature decreases monotonically from the surface at 
a rate of about 6.5 K km$^{-1}$, except for an occasional low-level inversion, until it
reaches about 210 K at an altitude of approximately 12 km at midlatitudes. This
lowermost layer is called the troposphere. Above 12 km, the temperature is relatively 
constant for a distance of about 10 km in the region called the tropopause. Above 
the tropopause, the temperature begins to rise with altitude in the stratosphere, due 
to the presence of ozone, reaching about 260 K at 45 km altitude. Above this level, 
the temperature drops with altitude through the mesosphere before rising again in 
the upper atmosphere, where the neutral atmosphere gives way to the ionosphere. 
Within the neutral atmosphere, the propagation of radio waves is most affected by 
the troposphere. Before discussing the refraction, absorption, and scattering of radio 
waves in the troposphere in detail, we introduce some basic physical concepts. 


13.1.1 Basic Physics 


Consider a plane wave propagating along the y direction in a uniform dissipative 
dielectric medium, as represented by the equation 


$$ E(y, t) = E_0 e^{j(kny - 2\pi\nu t)} , \tag{13.1} $$

where $k$ is the propagation constant in free space and is equal to $2\pi\nu/c$, $c$ is the
velocity of light, and $E_0$ is the electric field amplitude. $n$ is the complex index
of refraction, equal to $n_R + jn_I$. If the imaginary part of the index of refraction
is positive, the wave will decay exponentially. The power absorption coefficient is
defined as

$$ \alpha = \frac{4\pi\nu}{c} n_I . \tag{13.2} $$


Its units are m$^{-1}$. The propagation constant in the atmosphere is $k$ multiplied by the
real part of the index of refraction, which can be written

$$ kn_R = \frac{2\pi n \nu}{c} = \frac{2\pi\nu}{v_p} , \tag{13.3} $$

where $n = n_R$ is the index of refraction when absorption is neglected, and $v_p$ is the
phase velocity. The phase velocity of the wave, $c/n$, is less than $c$ by about 0.03%
in the lower atmosphere. The extra time required to traverse a medium with index
of refraction $n(y)$ compared with the time necessary to traverse the same distance in
free space is

$$ \Delta t = \frac{1}{c} \int (n - 1)\, dy , \tag{13.4} $$

where we assume that the effect of the difference in physical length between the
actual ray path and the straight-line path is negligible. The excess path length is
defined as $c\Delta t$, or

$$ \mathcal{L} = 10^{-6} \int N(y)\, dy , \tag{13.5} $$

where we have introduced the refractivity $N$, defined by $N = 10^6 (n - 1)$. Note that
the concept of excess path length, which is used extensively in this chapter, does not
represent an actual physical path.

A widely accepted expression for the radio refractivity is (Rüeger 2002)

$$ N = 77.6898 \frac{p_D}{T} + 71.2952 \frac{p_V}{T} + 375463 \frac{p_V}{T^2} , \tag{13.6} $$

where $T$ is the temperature in kelvins, $p_D$ is the partial pressure of the dry air, and
$p_V$ is the partial pressure of water vapor in millibars (1 mb = 100 newtons per
square meter = 100 pascals = 1 hectopascal; 1 atmosphere = 1013 mb). The first
two terms on the right side of Eq. (13.6) arise from the displacement polarizations
of the gaseous constituents of the air (N2, O2, CO2, and H2O). The third term is
due to the permanent dipole moment of water vapor. Equation (13.6) is formally
known as the “zero-frequency” limit for the refractivity but is accurate to better than
1% for frequencies below 100 GHz. The contributions of dispersive components of
refractivity associated with resonances below 100 GHz are very small. Between 100
and 1,000 GHz, the deviations from unity of the refractivity are more significant (see
discussion in Sect. 13.1.4).
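
Equation (13.6) is straightforward to evaluate numerically. The short Python sketch below does so; the function name `refractivity` and the sample surface conditions (280 K, 1003 mb of dry air, 10 mb of water vapor) are assumptions made here for illustration.

```python
def refractivity(temp_k, p_dry_mb, p_vapor_mb):
    """Radio refractivity N from Eq. (13.6); pressures in mb, temperature in K."""
    dry_term = 77.6898 * p_dry_mb / temp_k
    vapor_induced_term = 71.2952 * p_vapor_mb / temp_k
    vapor_dipole_term = 375463.0 * p_vapor_mb / temp_k ** 2
    return dry_term + vapor_induced_term + vapor_dipole_term

# Hypothetical surface conditions, not taken from the text.
n_total = refractivity(280.0, 1003.0, 10.0)

# Index of refraction minus one, from the definition N = 1e6 (n - 1).
n_minus_one = 1.0e-6 * n_total
```

For these conditions $n - 1$ comes out at a few parts in $10^4$, in line with the 0.03% reduction in phase velocity quoted above.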

The refractivity can be expressed in terms of gas density, using the ideal gas law, 


$$ p = \frac{\rho R T}{M} , \tag{13.7} $$

where $p$ and $\rho$ are the partial pressure and density of any constituent gas; $R$ is the
universal gas constant, equal to 8.314 J mol$^{-1}$ K$^{-1}$; and $M$ is the molecular weight,
which for dry air in the troposphere is $M_D = 28.96$ g mol$^{-1}$ and for water vapor
is $M_V = 18.02$ g mol$^{-1}$. Thus, $p_D = \rho_D RT/M_D$ and $p_V = \rho_V RT/M_V$, where
$\rho_D$ and $\rho_V$ are the densities of dry air and water vapor, respectively. Since the total
pressure $P$ is the sum of the partial pressures, and the total density $\rho_T$ is the sum of
the constituent densities, Eq. (13.7) can be written $P = \rho_T RT/M_T$, where

$$ M_T = \left( \frac{1}{M_D} \frac{\rho_D}{\rho_T} + \frac{1}{M_V} \frac{\rho_V}{\rho_T} \right)^{-1} . \tag{13.8} $$

Substitution of the appropriate forms of Eq. (13.7) and the equation $\rho_D = \rho_T - \rho_V$
into Eq. (13.6) yields

$$ N = 0.2228 \rho_T + 0.076 \rho_V + 1742 \frac{\rho_V}{T} , \tag{13.9} $$

where $\rho_T$ and $\rho_V$ are in g m$^{-3}$. Since the second term on the right side of Eq. (13.9)
is small with respect to the third term, it can be combined with the third term to give,
for $T = 280$ K,

$$ N \simeq 0.2228 \rho_T + 1763 \frac{\rho_V}{T} = N_D + N_V . \tag{13.10} $$

Equation (13.10) defines the dry and wet refractivities, $N_D$ and $N_V$, respectively.
These definitions are not universally followed in the literature. Note that $N_D$ is
proportional to the total density and therefore has a contribution due to the induced 
dipole moment of water vapor. Mean values of the distribution of the column 
density of water vapor around the world are shown in Fig. 13.2. For a discussion 
of climatology of water vapor, see Peixoto and Oort (1996). 

Fig. 13.2 Worldwide distribution of total water vapor content ($w$) based on satellite and ground-based
observations over the ten-year period 2005-2014 in the framework of a global atmospheric
model. The color scale denotes the column density in units of kg m$^{-2}$ (equivalent to millimeters
of precipitable water). Note that the resolution is not sufficient to show small localized areas of
low water vapor, such as Mauna Kea. Data from the NASA MERRA Program. See Rienecker et al.
(2011).

The atmosphere is in hydrostatic equilibrium to a high degree of accuracy
(Andrews 2000). A parcel of gas in static equilibrium between pressure and gravity
obeys the equation

$$ \frac{dP}{dh} = -\rho_T g , \tag{13.11} $$

where $g$ is the acceleration due to gravity, approximately equal to 980 cm s$^{-2}$, and
$h$ is the height above the Earth’s surface. Using the ideal gas law, Eq. (13.7), we
can integrate Eq. (13.11), assuming specific forms for the temperature profile and
mixing ratio. If an isothermal atmosphere with constant mixing ratio is assumed,
then $\rho_T$ is an exponential function with a scale height of $RT/Mg \simeq 8.5$ km for
290 K, which is close to the observed scale height. Other models are described by
Hess (1959). The excess path length caused by the dry component of refractivity
does not depend on the height distribution of total density or temperature, but
only on the surface pressure $P_0$, under conditions of hydrostatic equilibrium. If
$g$ is assumed to be constant with height, the surface pressure can be obtained by
integrating Eq. (13.11),

$$ P_0 = g \int_0^\infty \rho_T(h)\, dh . \tag{13.12} $$


From Eqs. (13.5), (13.10), and (13.12), the dry excess path length in the zenith
direction is

$$ \mathcal{L}_D = 10^{-6} \int_0^\infty N_D\, dh = A P_0 , \tag{13.13} $$

where $A = 77.6 R/gM_D = 0.228$ cm mb$^{-1}$. Under standard conditions for which
$P_0 = 1013$ mb, the value of $\mathcal{L}_D$ is 231 cm.
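
The value of $A$ can be checked by carrying the units through explicitly. The sketch below is one way to do this in Python; the constant names are introduced here, and $g = 9.80$ m s$^{-2}$ is an assumed value consistent with the approximate figure of 980 cm s$^{-2}$ quoted above.

```python
R_GAS = 8.314        # universal gas constant, J mol^-1 K^-1
M_DRY = 28.96e-3     # molecular weight of dry air, kg mol^-1
G_ACCEL = 9.80       # assumed gravitational acceleration, m s^-2
K_DRY = 77.6         # dry refractivity coefficient in A = 77.6 R / (g M_D), K mb^-1

def dry_path_coefficient_cm_per_mb():
    """Coefficient A of Eq. (13.13) in cm per millibar.

    With p in mb, p/T = rho R / (100 M) for rho in kg m^-3, and the column
    mass is 100 P0 / g for P0 in mb. The two factors of 100 cancel, which
    leaves a result in metres per millibar before the final conversion.
    """
    metres_per_mb = 1.0e-6 * K_DRY * R_GAS / (M_DRY * G_ACCEL)
    return 100.0 * metres_per_mb

coefficient_a = dry_path_coefficient_cm_per_mb()
dry_path_cm = coefficient_a * 1013.0   # standard surface pressure in mb
```

The result is close to, but not exactly, the quoted 0.228 cm mb$^{-1}$; the small difference depends on the value adopted for $g$.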

Water vapor is not well mixed in the atmosphere and therefore is not well corre- 
lated with ground-based meteorological parameters (e.g., Reber and Swope 1972). 
On average, water vapor density has an approximately exponential distribution with 
a scale height of 2 km. This can be understood in the following way. The partial
pressure and density of water vapor from Eq. (13.7) are related by

$$ \rho_V = \frac{217 p_V}{T} \quad (\mathrm{g\ m^{-3}}) . \tag{13.14} $$

The partial pressure of water vapor for saturated air, $p_{VS}$, at temperature $T$, obtained
from the Clausius–Clapeyron equation (Hess 1959), can be approximated to an
accuracy of better than 1% within the temperature range 240-310 K by the formula
(Crane 1976)

$$ p_{VS} = 6.11 \left( \frac{T}{273} \right)^{-5.3} e^{25.2 (T - 273)/T} \quad (\mathrm{mb}) . \tag{13.15} $$

The relative humidity is $p_V/p_{VS}$. This approximation to the Clausius–Clapeyron
equation is nearly an exponential function of temperature, dropping from 10.0 mb at
280 K to 3.7 mb (a factor of $e^{-1}$) at 266 K. For a lapse rate in temperature of 6 K km$^{-1}$,
the profile of water vapor density is very close to an exponential function with a
scale height of 2.5 km. For the purpose of this discussion, we adopt a simple model
for the wet atmosphere as being isothermal with a scale height of 2.0 km, as is often
observed.
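
Equations (13.14) and (13.15) can be evaluated directly, as in the following Python sketch. The function names are introduced here for illustration only.

```python
import math

def saturation_vapor_pressure_mb(temp_k):
    """Saturation partial pressure of water vapor from Eq. (13.15), in mb."""
    return 6.11 * (temp_k / 273.0) ** -5.3 * math.exp(25.2 * (temp_k - 273.0) / temp_k)

def vapor_density_g_m3(p_vapor_mb, temp_k):
    """Water vapor density from Eq. (13.14), in g m^-3."""
    return 217.0 * p_vapor_mb / temp_k

p_280 = saturation_vapor_pressure_mb(280.0)
p_266 = saturation_vapor_pressure_mb(266.0)
ratio = p_266 / p_280              # roughly a factor of 1/e over 14 K
rho_280 = vapor_density_g_m3(p_280, 280.0)
```

The ratio of the two pressures is close to $e^{-1}$, which is the sense in which the formula is nearly exponential in temperature.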

The component of the path length resulting primarily from the permanent dipole
moment of water vapor is, from Eq. (13.10),

$$ \mathcal{L}_V = 1763 \times 10^{-6} \int_0^\infty \frac{\rho_V(h)}{T(h)}\, dh , \tag{13.16} $$

where the units of $\mathcal{L}_V$ are the same as those of $h$. Hence, for the approximation
above, we obtain

$$ \mathcal{L}_V = 350 \frac{\rho_{V0}}{T} \quad (\mathrm{cm}) \tag{13.17a} $$

or

$$ \mathcal{L}_V = 7.6 \times 10^4 \frac{p_{V0}}{T^2} \quad (\mathrm{cm}) , \tag{13.17b} $$

where $\rho_{V0}$ and $p_{V0}$ are the density and partial pressure of water vapor at the surface
of the Earth, respectively. Hence, for $T = 280$ K, the path length is given by $\mathcal{L}_V = 1.26 \rho_{V0} = 0.97 p_{V0}$.
The integrated water vapor density, or the height of the column of water
condensed from the atmosphere, is given by

$$ w = \frac{1}{\rho_w} \int_0^\infty \rho_V(h)\, dh , \tag{13.18} $$

where $\rho_w$ is the density of water, $10^6$ g m$^{-3}$. Hence, from Eq. (13.16), for an
isothermal atmosphere at 280 K,

$$ \mathcal{L}_V \simeq 6.3 w . \tag{13.19} $$

This formula, which is widely used in the literature, is an excellent approximation
for frequencies below 100 GHz. In the windows above 100 GHz, the ratio $\mathcal{L}_V/w$ can
vary from 6.3 to about 8 (see Fig. 13.9 and associated discussion). The values of $\mathcal{L}_V$
under extreme conditions for a temperate, sea-level site can be calculated from the
equations above. With $T = 303$ K (30°C) and relative humidity = 0.8, we have
$p_{V0} = 34$ mb, $\rho_{V0} = 24$ g m$^{-3}$, $w = 4.9$ cm, and $\mathcal{L}_V = 28$ cm. With $T = 258$ K
(−15°C) and relative humidity = 0.5, we have $p_{V0} = 1.0$ mb, $\rho_{V0} = 0.8$ g m$^{-3}$,
$w = 0.15$ cm, and $\mathcal{L}_V = 1.1$ cm. The total zenith excess path length through the
atmosphere is $\mathcal{L} \simeq \mathcal{L}_D + \mathcal{L}_V$, which, from Eqs. (13.13) and (13.19), is

$$ \mathcal{L} \simeq 0.228 P_0 + 6.3 w \quad (\mathrm{cm}) , \tag{13.20} $$

where $P_0$ is in millibars, and $w$ is in centimeters. Equation (13.20) is reasonably
accurate for estimation purposes because the fractional variation in the temperature 
of the lower atmosphere, and in the scale height of water vapor, is usually less than 
10%. However, it is usually not accurate enough to predict the path length to a small 
fraction of a wavelength at millimeter wavelengths. 
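
The two extreme cases quoted above can be reproduced with a few lines of Python. The sketch below assumes the 2-km exponential scale height adopted in the text and uses Eq. (13.17b) for the wet path; the function names and the tuple layout are choices made here.

```python
import math

H_WET_M = 2000.0       # wet scale height adopted in the text, in metres
RHO_WATER = 1.0e6      # density of liquid water, g m^-3

def saturation_vapor_pressure_mb(temp_k):
    """Eq. (13.15)."""
    return 6.11 * (temp_k / 273.0) ** -5.3 * math.exp(25.2 * (temp_k - 273.0) / temp_k)

def wet_summary(temp_k, relative_humidity):
    """Return (p_V0 [mb], rho_V0 [g m^-3], w [cm], L_V [cm]) at the surface."""
    p_v0 = relative_humidity * saturation_vapor_pressure_mb(temp_k)
    rho_v0 = 217.0 * p_v0 / temp_k                    # Eq. (13.14)
    w_cm = 100.0 * rho_v0 * H_WET_M / RHO_WATER       # Eq. (13.18), exponential profile
    wet_path_cm = 7.6e4 * p_v0 / temp_k ** 2          # Eq. (13.17b)
    return p_v0, rho_v0, w_cm, wet_path_cm

def total_zenith_path_cm(surface_pressure_mb, w_cm):
    """Eq. (13.20)."""
    return 0.228 * surface_pressure_mb + 6.3 * w_cm

warm = wet_summary(303.0, 0.8)
cold = wet_summary(258.0, 0.5)
total_warm = total_zenith_path_cm(1013.0, warm[2])
```

The computed values agree with the rounded figures in the text to within the last quoted digit or so; Eq. (13.19), which assumes 280 K, gives a slightly different wet path than Eq. (13.17b) evaluated at the actual surface temperature.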


13.1.2 Refraction and Propagation Delay 


If the vertical distributions of temperature and water vapor pressure are known, 
then precise estimates of the angle of arrival and excess propagation time for a ray 
impinging on the atmosphere at an arbitrary angle can be computed by ray tracing. 
Here, we consider a few elementary cases in order to derive some simple analytic 
expressions. The simplest case is that of an interferometer in a uniform or plane- 
parallel atmosphere, as shown in Fig. 13.3. The refraction of the ray is governed by 
Snell’s law, which is 


$$ n_0 \sin z_0 = \sin z , \tag{13.21} $$

where $z$ is the zenith angle at the top of the atmosphere (where $n = 1$), and $z_0$
is the zenith angle at the surface (where $n = n_0$). The geometric delay for an
interferometer, as defined in Chap. 2, is

$$ \tau_g = \frac{n_0 D}{c} \sin z_0 = \frac{D}{c} \sin z . \tag{13.22} $$

Fig. 13.3 Two-element interferometer with the atmosphere modeled as a uniform flat slab. The
geometric delay is the same as it would be if the interferometer were in free space.


$\tau_g$ can be calculated from the angle of arrival $z_0$ and the velocity of light at the
Earth’s surface $c/n_0$, or from $z$ and the velocity of light in free space. Thus, if Earth
curvature is neglected and the atmosphere is uniform, the resulting geometric delay
is the same as the free-space value. The angle of refraction need only be calculated
to ensure that the antennas track the source properly. The angle of refraction, $\Delta z = z - z_0$,
can be written, using Eq. (13.21), as

$$ \Delta z = z - \sin^{-1} \left( \frac{1}{n_0} \sin z \right) . \tag{13.23} $$

This equation can be expanded in a Taylor series in $n_0 - 1$, which to first order gives

$$ \Delta z \simeq (n_0 - 1) \tan z . \tag{13.24} $$

Since $n_0 - 1 \simeq 3 \times 10^{-4}$ at the surface of the Earth, Eq. (13.24) can be written

$$ \Delta z\ (\mathrm{arcmin}) \simeq \tan z . \tag{13.25} $$
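
The quality of the first-order expansion can be checked numerically. The Python sketch below compares Eq. (13.23) with Eq. (13.24) for $n_0 - 1 = 3 \times 10^{-4}$; the function names and the choice of a 45° zenith angle are assumptions made here for illustration.

```python
import math

def refraction_exact(z_rad, n0):
    """Angle of refraction from Eq. (13.23), in radians."""
    return z_rad - math.asin(math.sin(z_rad) / n0)

def refraction_first_order(z_rad, n0):
    """First-order angle of refraction from Eq. (13.24), in radians."""
    return (n0 - 1.0) * math.tan(z_rad)

n0 = 1.0 + 3.0e-4
z = math.radians(45.0)

exact_arcmin = math.degrees(refraction_exact(z, n0)) * 60.0
approx_arcmin = math.degrees(refraction_first_order(z, n0)) * 60.0
```

At this zenith angle $\tan z = 1$, so both values should be close to 1 arcmin, as Eq. (13.25) indicates.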
The angle of refraction can also be calculated for more realistic cases. Ignore the
curvature of the Earth, and consider the atmosphere to consist of a large number of
plane-parallel layers numbered 0 through $m$, as shown in Fig. 13.4. Let the index of
refraction at the surface be $n_0$, and at the top layer, $n_m = 1$. Applying Snell’s law to
the various layers gives the following set of equations:

$$ \begin{aligned} n_0 \sin z_0 &= n_1 \sin z_1 \\ n_1 \sin z_1 &= n_2 \sin z_2 \\ &\;\;\vdots \\ n_{m-1} \sin z_{m-1} &= \sin z , \end{aligned} \tag{13.26} $$

where $z = z_m$. From these equations, we see that $n_0 \sin z_0 = \sin z$. This result
is identical to that for the homogeneous case. Thus, regardless of the vertical
distribution of the index of refraction, the angle of refraction is given by Eq. (13.21),
where $n_0$ is the surface value of the index of refraction. This result can also
be obtained by an elementary application of Fermat’s principle. An interesting
application of this result is that if $n_0 = 1$, as would be the case if the measuring
device were in a vacuum chamber at the surface of the Earth, then there would be
no net refraction; that is, $z_0 = z$.

Fig. 13.4 The atmosphere modeled as a set of thin, uniform slabs. The angle of incidence on the
topmost slab is $z_m$, which is equal to the free-space zenith angle $z$, and the angle of incidence at the
surface is $z_0$. The total bending is $\Delta z = z - z_0$.
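
The independence of the result from the vertical profile can be demonstrated by applying Snell’s law layer by layer. The following Python sketch does this for an arbitrary profile; the function name and the exponential profile with 50 layers are hypothetical choices made here, not part of the text.

```python
import math

def surface_zenith_angle(z_top_rad, indices):
    """Trace a ray down through plane-parallel layers using Eq. (13.26).

    indices[0] is the surface value n_0 and indices[-1] is the value in the
    topmost layer, where the zenith angle is z_top_rad.
    """
    angle = z_top_rad
    n_above = indices[-1]
    for n_below in reversed(indices[:-1]):
        angle = math.asin(n_above * math.sin(angle) / n_below)
        n_above = n_below
    return angle

# Hypothetical profile: refractivity falling off exponentially, top layer n = 1.
indices = [1.0 + 3.0e-4 * math.exp(-k / 10.0) for k in range(50)] + [1.0]

z = math.radians(60.0)
z0_layered = surface_zenith_angle(z, indices)
z0_direct = math.asin(math.sin(z) / indices[0])   # Eq. (13.21) with n_0 only
```

The two surface angles agree to rounding error, whatever intermediate values are placed in `indices`.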
For an atmosphere consisting of spherical layers, the angle of refraction is given
by the formula (Smart 1977)

$$ \Delta z = r_0 n_0 \sin z_0 \int_1^{n_0} \frac{dn}{n \sqrt{r^2 n^2 - r_0^2 n_0^2 \sin^2 z_0}} , \tag{13.27} $$

where $r$ is the distance from the center of the Earth to the layer where the index of
refraction is $n$ and $r_0$ is the radius of the Earth. This result is derivable from Snell’s
law in spherical coordinates: $nr \sin z$ = constant (Smart 1977). For small zenith
angles, expansion of Eq. (13.27) gives

$$ \Delta z \simeq (n_0 - 1) \tan z_0 - a_2 \tan z_0 \sec^2 z_0 , \tag{13.28} $$

where $a_2$ is a constant. Equation (13.28) can also be written

$$ \Delta z \simeq a_1 \tan z_0 - a_2 \tan^3 z_0 , \tag{13.29} $$

where $a_1 \simeq 56''$ and $a_2 \simeq 0.07''$ for a dry atmosphere under standard conditions
(COESA 1976). The refraction at the horizon is about 0.46° (see Fig. 13.6). See
Saastamoinen (1972a) for a more detailed treatment.

The differential delay induced in an interferometer by a horizontally stratified 
troposphere results from the difference in zenith angle of the source at the antennas. 
Consider two closely spaced antennas. If the excess path in the zenith direction
is $\mathcal{L}_0$, then the excess path in other directions is approximately $\mathcal{L}_0 \sec z$. This
approximation becomes inaccurate at large zenith angles. The difference in excess
paths, $\Delta\mathcal{L}$, by first-order expansion, is

$$ \Delta\mathcal{L} \simeq \mathcal{L}_0 \frac{\sin z}{\cos^2 z} \Delta z , \tag{13.30} $$

where $\Delta z$ is the difference in zenith angles at the two antennas.

If the antennas are on the equator and the source has a declination of zero, then
$\Delta z$ is equal to the difference in longitudes, or approximately $D/r_0$, where $D$ is the
separation between antennas. For this case,

$$ \Delta\mathcal{L} \simeq \frac{\mathcal{L}_0 D \sin z}{r_0 \cos^2 z} . \tag{13.31} $$

If $D = 10$ km, $\mathcal{L}_0 = 230$ cm, $r_0 = 6370$ km, and $z = 80°$, then $\Delta\mathcal{L}$ is 12 cm. The
calculation of the difference in excess paths can be easily generalized as follows.
Let $\mathbf{r}_1$ and $\mathbf{r}_2$ be vectors from the center of the Earth to each antenna. The geometric
delay is $(\mathbf{r}_1 \cdot \mathbf{s} - \mathbf{r}_2 \cdot \mathbf{s})/c$, where $\mathbf{s}$ is the unit vector in the direction of the source.
Since $\cos z_1 = (\mathbf{r}_1 \cdot \mathbf{s})/r_0$ and $\cos z_2 = (\mathbf{r}_2 \cdot \mathbf{s})/r_0$, where $z_1$ and $z_2$ are the zenith
angles at the two antennas, the geometric delay can be written

$$ \tau_g = \frac{r_0}{c} (\cos z_1 - \cos z_2) \simeq \frac{r_0}{c} \Delta z \sin z . \tag{13.32} $$


Substitution of $\Delta z$ from Eq. (13.32) into Eq. (13.30) yields an expression for the
difference in excess path lengths, valid for short-baseline interferometers and
moderate values of zenith angle:

$$ \Delta\mathcal{L} \simeq \frac{c \tau_g \mathcal{L}_0}{r_0} \sec^2 z . \tag{13.33} $$
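
The numerical example given with Eq. (13.31) can be reproduced, and compared with Eq. (13.33), using the short Python sketch below. The function names are introduced here, and the delay is expressed as the length $c\tau_g$ in kilometers for convenience.

```python
import math

def delta_path_equatorial(l0_cm, d_km, r0_km, z_deg):
    """Difference in excess path from Eq. (13.31), in cm."""
    z = math.radians(z_deg)
    return l0_cm * (d_km / r0_km) * math.sin(z) / math.cos(z) ** 2

def delta_path_from_delay(l0_cm, c_tau_g_km, r0_km, z_deg):
    """Difference in excess path from Eq. (13.33), in cm."""
    z = math.radians(z_deg)
    return c_tau_g_km * l0_cm / (r0_km * math.cos(z) ** 2)

delta_31 = delta_path_equatorial(230.0, 10.0, 6370.0, 80.0)

# For the equatorial case, Eq. (13.32) with delta z = D / r0 gives c tau_g = D sin z.
c_tau_g_km = 10.0 * math.sin(math.radians(80.0))
delta_33 = delta_path_from_delay(230.0, c_tau_g_km, 6370.0, 80.0)
```

Both expressions give a value near 12 cm for the parameters quoted in the text.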


Fig. 13.5 Geometry for calculating the propagation delay, taking into account the
sphericity of the Earth. The ray path along the $y$ coordinate is assumed to be
straight. The angle $z_i$ is the zenith angle of the ray at height $h$. This angle is needed
in the calculation of the excess path length through the ionosphere.

For very-long-baseline interferometers, the expression in Eq. (13.30) is not
appropriate. The difference in excess path lengths is approximately $\Delta\mathcal{L} = \mathcal{L}_1 \sec z_1 - \mathcal{L}_2 \sec z_2$,
where $\mathcal{L}_1$, $\mathcal{L}_2$, $z_1$, and $z_2$ are the excess zenith path lengths and
the zenith angles at the two antennas. We now derive a more accurate expression for
the excess path length to each antenna. The geometry is shown in Fig. 13.5. Assume
the index of refraction to be exponentially distributed with a scale height $h_0$. The
excess path length is

$$ \mathcal{L} = 10^{-6} N_0 \int_0^\infty \exp\left( -\frac{h}{h_0} \right) dy , \tag{13.34} $$


where $N_0$ is the refractivity at the Earth’s surface, $h$ is the height above the surface,
and $dy$ is the differential length along the ray path. Bending of the ray is neglected.
From the geometry of Fig. 13.5, $(h + r_0)^2 = r_0^2 + y^2 + 2 r_0 y \cos z$. Using the quadratic
formula and the second-order expansion $(1 + \Delta)^{1/2} \simeq 1 + \Delta/2 - \Delta^2/8$, where
$\Delta = (y^2 + 2 y r_0 \cos z)/r_0^2$, one can show that

$$ h \simeq y \cos z + \frac{y^2}{2 r_0} \sin^2 z . \tag{13.35} $$

Therefore,

$$ \mathcal{L} \simeq 10^{-6} N_0 \int_0^\infty \exp\left( -\frac{y}{h_0} \cos z \right) \exp\left( -\frac{y^2}{2 r_0 h_0} \sin^2 z \right) dy . \tag{13.36} $$

The argument of the rightmost exponential function in Eq. (13.36) is small, and this
exponential function can be expanded in a Taylor series so that

$$ \mathcal{L} \simeq 10^{-6} N_0 \int_0^\infty \exp\left( -\frac{y}{h_0} \cos z \right) \left( 1 - \frac{y^2}{2 r_0 h_0} \sin^2 z + \cdots \right) dy . \tag{13.37} $$

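Integration of Eq. (13.37) yields

$$ \mathcal{L} \simeq 10^{-6} N_0 h_0 \sec z \left( 1 - \frac{h_0}{r_0} \tan^2 z \right) . \tag{13.38} $$

Equation (13.38) can also be written

$$ \mathcal{L} \simeq 10^{-6} N_0 h_0 \left[ \left( 1 + \frac{h_0}{r_0} \right) \sec z - \frac{h_0}{r_0} \sec^3 z \right] . \tag{13.39} $$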


Thus, $\mathcal{L}$ is a function of odd powers of $\sec z$, whereas the bending angle, given
in Eq. (13.29), is a function of odd powers of $\tan z$. Equations (13.38) and (13.39)
both diverge as $z$ approaches 90°. For $z = 90°$, Eq. (13.35) shows that $h \simeq y^2/2r_0$.
Hence, by direct integration of Eq. (13.34), the excess path at the horizon is

$$ \mathcal{L} = 10^{-6} N_0 \sqrt{\frac{\pi r_0 h_0}{2}} \simeq 70 \mathcal{L}_0 \simeq 14 N_0 \quad (\mathrm{cm}) \tag{13.40} $$

for $r_0 = 6370$ km and $h_0 = 2$ km.
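
The approximations leading to Eqs. (13.38) and (13.40) can be tested against a direct numerical integration of Eq. (13.34) using the exact height from the geometry of Fig. 13.5. The Python sketch below works with the ratio $\mathcal{L}/\mathcal{L}_0$, where $\mathcal{L}_0 = 10^{-6} N_0 h_0$; the trapezoidal rule, the 2000-km integration limit, and the step count are choices made here, not part of the text.

```python
import math

R0_KM = 6370.0   # Earth radius used in the text
H0_KM = 2.0      # scale height used for Eq. (13.40)

def height_km(y_km, z_rad):
    """Exact height from (h + r0)^2 = r0^2 + y^2 + 2 r0 y cos z."""
    return math.sqrt(R0_KM ** 2 + y_km ** 2 + 2.0 * R0_KM * y_km * math.cos(z_rad)) - R0_KM

def path_ratio_numeric(z_rad, y_max_km=2000.0, steps=100000):
    """Trapezoidal estimate of L / L0 from Eq. (13.34)."""
    dy = y_max_km / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-height_km(k * dy, z_rad) / H0_KM)
    return total * dy / H0_KM

def path_ratio_series(z_rad):
    """L / L0 from Eq. (13.38)."""
    return (1.0 - (H0_KM / R0_KM) * math.tan(z_rad) ** 2) / math.cos(z_rad)

z60 = math.radians(60.0)
numeric_60 = path_ratio_numeric(z60)
series_60 = path_ratio_series(z60)

horizon_numeric = path_ratio_numeric(math.radians(90.0))
horizon_formula = math.sqrt(math.pi * R0_KM * H0_KM / 2.0) / H0_KM   # Eq. (13.40)
```

At a zenith angle of 60° the series and the numerical integral agree closely, and at the horizon the numerical ratio is near the factor of about 70 in Eq. (13.40).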
A model incorporating both the dry atmosphere with a scale height $h_D = 8$ km
and the wet atmosphere with a scale height $h_V = 2$ km can be obtained by applying
Eq. (13.38) to both the dry and wet components using Eqs. (13.13) and (13.17). This
result is

$$ \mathcal{L} \simeq 0.228 P_0 \sec z\, (1 - 0.0013 \tan^2 z) + \frac{7.5 \times 10^4 p_{V0} \sec z}{T^2} (1 - 0.0003 \tan^2 z) . \tag{13.41} $$


More sophisticated models have been derived by Marini (1972), Saastamoinen 
(1972b), Davis et al. (1985), Niell (1996), and others. A comparison of the 
approximate formula of Eq. (13.41), a simple sec z model, and a ray-tracing solution 
is given in Fig. 13.6. 
Fig. 13.6 (a) The bending angle vs. 90° − $z$, where $z$ is the zenith angle that the ray would have
in the absence of refraction, calculated by a ray-tracing algorithm for a standard dry atmosphere
(COESA 1976). (b) The excess path length vs. 90° − $z$ calculated by a ray-tracing algorithm. The
zenith excess path is 2.31 m. (c) Deviation between the excess path length and (1) the $\mathcal{L}_0 \sec z$
model and (2) the model of Eq. (13.41); in both cases, $p_{V0} = 0$, and the zenith excess path is the
same as in (b).
13.1.3 Absorption


When the sky is clear, the principal sources of atmospheric attenuation are the 
molecular resonances of water vapor, oxygen, and ozone. The resonances of 
water vapor and oxygen are strongly pressure broadened in the troposphere and 
cause attenuation far from the resonance frequencies. A plot of the absorption vs. 
frequency is shown in Fig. 13.7. Below 30 GHz, absorption is dominated by the
weak $6_{16}$–$5_{23}$ transition of H2O at 22.2 GHz (Liebe 1969). Absorption by this line
rarely exceeds 20% in the zenith direction. (See Appendix 13.1 for the history of 
research on this line.) 

The oxygen lines in the band 50-70 GHz are considerably stronger, and no 
astronomical observations can be made from the ground in this band. An isolated 
oxygen line at 118 GHz makes observations impossible in the band 116-120 GHz. 
At higher frequencies, there is a series of strong water vapor lines at 183, 325, 380, 
448, 475, 557, 621, 752, 988, and 1097 GHz and higher (Liebe 1981). Observations
can be made in the windows between these lines at dry locations, usually found 
at high altitudes. The physics of atmospheric absorption is discussed in detail by 
Waters (1976), and a model of absorption at frequencies below 1000 GHz is given 
by Liebe (1981, 1985, 1989). We are concerned here only with the phenomenology 
of absorption and its calibration. The absorption coefficient depends on the temperature,
gas density, and total pressure. For example, the absorption coefficient for the
22-GHz H2O line can be written (Staelin 1966)

$$ \alpha = \left( 3.24 \times 10^{-4} e^{-644/T} \right) \frac{\nu^2 P \rho_V}{T^{3.125}} \left( 1 + 0.0147 \frac{\rho_V T}{P} \right) \left[ \frac{1}{(\nu - 22.235)^2 + \Delta\nu^2} + \frac{1}{(\nu + 22.235)^2 + \Delta\nu^2} \right] + 2.55 \times 10^{-8} \rho_V \nu^2 \frac{\Delta\nu}{T^{3/2}} \quad (\mathrm{cm^{-1}}) . \tag{13.42} $$

Here, $\Delta\nu$ is approximately the half-width at half-maximum of the line in gigahertz,
given by the equation

$$ \Delta\nu = 2.58 \times 10^{-3} \left( 1 + 0.0147 \frac{\rho_V T}{P} \right) \frac{P}{(T/318)^{0.625}} , \tag{13.43} $$

where $\nu$ is the frequency in gigahertz, $T$ is the temperature in kelvins, $P$ is the total
pressure in millibars, and $\rho_V$ is the water vapor density in grams per cubic meter.
The lineshape specified by Eq. (13.42), the Van Vleck–Weisskopf profile, appears
to fit the empirical data better than other theoretical profiles (Hill 1986). Other
parametrizations of the line profile are available, for example, Pol et al. (1998).

Fig. 13.7 Atmospheric zenith opacity. The absorption from narrow ozone lines has been omitted.
Adapted from Waters (1976). For zenith opacity at frequencies above 300 GHz, see Liebe (1981,
1989). Note that 2 g cm$^{-2}$ of H2O corresponds to $w = 2$ cm.

The intensity of a ray passing through an absorbing medium obeys the radiative
transfer equation. We assume that the medium is in local thermodynamic equilibrium
at temperature $T$ and that scattering is negligible. In the domain where the
Rayleigh–Jeans approximation to the Planck function is valid, so that the intensity
is proportional to the brightness temperature, the equation of radiative transfer can
be written (Rybicki and Lightman 1979)

$$ \frac{dT_B}{dy} = -\alpha (T_B - T) , \tag{13.44} $$

where $T_B$ is the brightness temperature and $\alpha$ is the absorption coefficient defined in
Eqs. (13.2) and (13.42). The solution to Eq. (13.44) for radiation propagating along
the $y$ axis is

$$ T_B(\nu) = T_{B0}(\nu) e^{-\tau_\nu} + \int_0^\infty \alpha(\nu, y) T(y) e^{-\tau'_\nu}\, dy , \tag{13.45} $$

where $T_{B0}$ is the brightness temperature in the absence of absorption, including the
cosmic background component,

$$ \tau'_\nu = \int_0^y \alpha(\nu, y')\, dy' , \tag{13.46} $$

and

$$ \tau_\nu = \int_0^\infty \alpha(\nu, y')\, dy' . \tag{13.47} $$


Here, $y$ is the distance measured from the observer. $\tau_\nu$ is called the optical depth,
or opacity. The first term on the right side of Eq. (13.45) describes the absorption
of the signal, and the second describes the emission contribution of the atmosphere.
Equation (13.45) illustrates the fundamental law that an absorbing medium must
also radiate. If $T(y)$ is constant throughout the medium, then Eq. (13.45) can be
written

$$ T_B(\nu) = T_{B0}(\nu) e^{-\tau_\nu} + T (1 - e^{-\tau_\nu}) . \tag{13.48} $$


The presence of absorption can have a very significant effect on system performance.
If the receiver temperature is $T_R$, then the system temperature, which is
the sum of $T_R$ and the atmospheric brightness temperature (the effects of ground
radiation being neglected), is

$$ T_S = T_R + T_{\rm at} (1 - e^{-\tau_\nu}) , \tag{13.49} $$

where $T_{\rm at}$ is the temperature of the atmosphere. In the absence of a source, the
antenna temperature is taken as equal to the brightness temperature of the sky.
Furthermore, if the brightness temperature scale is referenced to a point outside
the atmosphere by multiplying the measurements of brightness temperature [see
Eq. (13.48)] by $e^{\tau_\nu}$, then the effective system temperature is $T_S e^{\tau_\nu}$, or

$$ T'_S = T_R e^{\tau_\nu} + T_{\rm at} (e^{\tau_\nu} - 1) . \tag{13.50} $$

In effect, the atmospheric loss is modeled by an equivalent attenuator at the receiver
input. Suppose that $T_R = 30$ K, $T_{\rm at} = 290$ K, and $\tau_\nu = 0.2$; then the effective system
temperature is 100 K. In such a situation, the atmosphere would degrade the system
sensitivity by a factor of more than three. Note that the loss in sensitivity results
primarily from the increase in system temperature rather than from the attenuation
of the signal, which is only 20%. The emission from the atmosphere induces signals
in spaced antennas that are uncorrelated and thus contributes only to the noise in the
output of an interferometer.
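
The example above follows directly from Eqs. (13.49) and (13.50), as the following Python sketch shows. The function names are introduced here for illustration.

```python
import math

def system_temperature(t_receiver, t_atm, tau):
    """System temperature at the receiver, Eq. (13.49)."""
    return t_receiver + t_atm * (1.0 - math.exp(-tau))

def effective_system_temperature(t_receiver, t_atm, tau):
    """System temperature referred to a point outside the atmosphere, Eq. (13.50)."""
    return t_receiver * math.exp(tau) + t_atm * (math.exp(tau) - 1.0)

t_sys = system_temperature(30.0, 290.0, 0.2)
t_eff = effective_system_temperature(30.0, 290.0, 0.2)
degradation = t_eff / 30.0          # relative to the receiver temperature alone
signal_loss = 1.0 - math.exp(-0.2)  # fractional attenuation of the signal
```

The effective temperature is about 100 K, more than three times the receiver temperature, while the signal itself is attenuated by a little under 20%.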

The absorption can be estimated directly from measurements made with a radio
telescope. In one technique introduced by Dicke et al. (1946), called the tipping-scan
method, the opacity is determined from the atmospheric emission. If the antenna is
scanned from the zenith to the horizon, the observed brightness temperature, in the
absence of background sources, will depend on the zenith angle, since the opacity is
proportional to the path length through the atmosphere, which varies approximately
as $\sec z$. Thus, the atmospheric brightness temperature is

$$ T_B = T_{\rm at} (1 - e^{-\tau_0 \sec z}) , \tag{13.51} $$

where $\tau_0$ is the zenith opacity. When $\tau_0 \sec z \ll 1$,

$$ T_B \simeq T_{\rm at} \tau_0 \sec z . \tag{13.52} $$

For narrow-beamed antennas, the antenna temperature is equal to the brightness
temperature. For broad-beamed antennas, the antenna temperature is a zenith angle
weighted version using Eq. (13.51). The opacity can be found from the slope of
$T_B$ plotted vs. $\sec z$, assuming that $T_{\rm at}$ is the surface temperature. The accuracy of
this method is affected by ground pickup through the sidelobes, which varies as a
function of zenith angle.
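
A tipping-scan reduction can be sketched as follows. Rather than using the small-opacity slope of Eq. (13.52), this Python example inverts Eq. (13.51) exactly, so that $-\ln(1 - T_B/T_{\rm at})$ is proportional to $\sec z$, and fits a line through the origin. The function name, the synthetic noise-free data, and the assumed values of $T_{\rm at}$ and $\tau_0$ are illustrative choices, not part of the text.

```python
import math

def fit_zenith_opacity(zenith_deg, t_brightness, t_atm):
    """Least-squares slope through the origin of -ln(1 - T_B/T_at) vs. sec z."""
    sec_z = [1.0 / math.cos(math.radians(z)) for z in zenith_deg]
    depth = [-math.log(1.0 - tb / t_atm) for tb in t_brightness]
    return sum(s * d for s, d in zip(sec_z, depth)) / sum(s * s for s in sec_z)

# Synthetic tipping scan generated from Eq. (13.51) with assumed parameters.
tau_true, t_atm = 0.08, 280.0
angles = [0.0, 30.0, 45.0, 60.0, 70.0]
scan = [t_atm * (1.0 - math.exp(-tau_true / math.cos(math.radians(z)))) for z in angles]

tau_fit = fit_zenith_opacity(angles, scan, t_atm)
```

With real data the fit would also be affected by receiver noise and by the ground pickup mentioned above, neither of which is modeled here.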

The opacity can also be estimated from measurements of the absorption suffered
by a radio source over a range of zenith angles. The observed antenna temperature
on-source minus the antenna temperature off-source at the same zenith angle to
remove the emission [see Eq. (13.48)] is

$$ \Delta T_A = T_{S0} e^{-\tau_0 \sec z} , \tag{13.53} $$

where $T_{S0}$ is the component of antenna temperature due to the source in the absence
of the atmosphere. From Eq. (13.53),

$$ \ln \Delta T_A = \ln T_{S0} - \tau_0 \sec z . \tag{13.54} $$

Thus, $\tau_0$ can be found without knowledge of $T_{\rm at}$ if a sufficient range in $\sec z$ is
covered. This method is affected by changes in antenna gain as a function of zenith
angle.

Another technique, called the chopper-wheel method, is commonly used at
millimeter wavelengths. A wheel consisting of alternate open and absorbing sections
is placed in front of the feed horn. As the wheel rotates, the radiometer alternately
views the sky and the absorbing sections and synchronously measures the difference
in antenna temperature between the sky and the chopper wheel at temperature $T_0$.
Thus, the on-source and off-source antenna temperatures are

$$ \Delta T_{\rm on} = T_{S0} e^{-\tau_\nu} + T_{\rm at} (1 - e^{-\tau_\nu}) - T_0 \tag{13.55} $$

and

$$ \Delta T_{\rm off} = T_{\rm at} (1 - e^{-\tau_\nu}) - T_0 . \tag{13.56} $$
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Table 13.1 Empirical coefficients for estimating opacity from 
surface absolute humidity* 


v a Qa 
(GHz) (nepers) (nepers m? g7!) 
15 0.013 0.0009 
22.2 0.026 0.011 
35 0.039 0.0030 
90 0.039 0.0090 

Source: Waters (1976). 

“From the equation t = a + aj /pyo fitted to opacity data 


derived from radiosonde measurements and measurements of 


surface absolute humidity, pyo g m7. 


These measurements can be combined to obtain $T_{S0}$ and thereby eliminate the effect
of atmospheric absorption. In the case in which $T_0 = T_{at}$,

$$T_{S0} = \left(\frac{\Delta T_{off} - \Delta T_{on}}{\Delta T_{off}}\right) T_0 . \tag{13.57}$$

When sensitivity is critical, the chopper wheel is used only to calibrate the output
in the off-source position. $\Delta T_{off} - \Delta T_{on}$ in the numerator of Eq. (13.57) is then
replaced by $T_{off} - T_{on}$. Measurement of $T_{S0}$ provides the flux density of the source,
which determines the visibility at the origin of the $(u, v)$ plane.
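
A short numerical check of the chopper-wheel relation may help. The sketch below builds synthetic on- and off-source differences from Eqs. (13.55) and (13.56) and recovers the source temperature with Eq. (13.57). The function name and all numerical values are assumptions made for illustration.

```python
import math

def source_temperature(dt_on, dt_off, t_0):
    """Chopper-wheel estimate of T_S0, Eq. (13.57), valid when T_0 = T_at."""
    return (dt_off - dt_on) / dt_off * t_0

# Illustrative values: chopper and atmosphere both at 290 K, opacity 0.3.
T_0 = T_AT = 290.0
TAU = 0.3
TRUE_TS0 = 5.0
dt_off = T_AT * (1.0 - math.exp(-TAU)) - T_0                # Eq. (13.56)
dt_on = TRUE_TS0 * math.exp(-TAU) + dt_off                  # Eq. (13.55)
ts0_est = source_temperature(dt_on, dt_off, T_0)
```

The opacity cancels in the ratio, which is why the result does not depend on the value chosen for `TAU`.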

The opacity can also be estimated from surface meteorological measurements
when other data are not available. This method is not as accurate as the direct
radiometric measurement techniques described above but has the advantage of
not expending observing time. Waters (1976) has analyzed data on absorption vs.
surface water vapor density for a sea-level site at various frequencies by fitting them
to an equation of the form $\tau_0 = \alpha_0 + \alpha_1\rho_{V0}$. The coefficients $\alpha_0$ and $\alpha_1$ are listed in
Table 13.1.
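
The empirical relation can be applied directly. The following sketch stores the coefficients of Table 13.1 in a dictionary (the data layout and function name are assumptions for this example) and evaluates $\tau_0$ for a given surface absolute humidity.

```python
# Coefficients from Table 13.1 (Waters 1976): frequency in GHz ->
# (alpha_0 in nepers, alpha_1 in nepers m^3 g^-1).
WATERS_COEFFS = {
    15.0: (0.013, 0.0009),
    22.2: (0.026, 0.011),
    35.0: (0.039, 0.0030),
    90.0: (0.039, 0.0090),
}

def zenith_opacity(freq_ghz, rho_v0):
    """tau_0 = alpha_0 + alpha_1 * rho_V0 for a tabulated frequency.

    rho_v0 is the surface absolute humidity in g m^-3."""
    alpha_0, alpha_1 = WATERS_COEFFS[freq_ghz]
    return alpha_0 + alpha_1 * rho_v0

tau_22 = zenith_opacity(22.2, 7.5)
tau_90 = zenith_opacity(90.0, 7.5)
```

For $\rho_{V0} = 7.5$ g m$^{-3}$ this gives zenith opacities of roughly 0.11 nepers at both 22.2 and 90 GHz. The fit applies to a sea-level site, so it should not be extrapolated to high, dry sites without care.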


13.1.4 Origin of Refraction 


For practical reasons, we have discussed separately the effects of the propagation 
delay and the absorption in the neutral atmosphere. However, the delay and the 
absorption are intimately related because they are derived from the real and 
imaginary parts of the dielectric constant of the gas in the atmosphere. The real 
and imaginary parts of the dielectric constant are not independent but are related 
by the Kramers–Kronig relation, which is similar to the mathematical relation
known as the Hilbert transform (Van Vleck et al. 1951; Toll 1956). We now discuss
this relationship from the physical viewpoint of the classical theory of dispersion.
From this analysis, it will become clear why the atmospherically induced delay is
essentially independent of frequency, even in the vicinity of spectral lines that cause
significant absorption.

A dilute gas of molecules can be modeled as bound oscillators. In each molecule,
an electron with mass $m$ and charge $-e$ is harmonically bound to the nucleus, and
the electron's motion is characterized by a resonance frequency $\nu_0$ and damping
constant $2\pi\Gamma$. The equation of motion with a harmonic driving force $-eE_0 e^{-j2\pi\nu t}$
caused by the electric field of an electromagnetic wave can be approximated as

$$m\ddot{x} + 2\pi m\Gamma\dot{x} + 4\pi^2 m\nu_0^2 x = -eE_0 e^{-j2\pi\nu t} , \tag{13.58}$$

where $x$ is the displacement of the bound electron, $E_0$ and $\nu$ are the amplitude and
frequency of the applied electric field, and the dots denote time derivatives. The
steady-state solution has the form $x = x_0 e^{-j2\pi\nu t}$, where

$$x_0 = \frac{eE_0/4\pi^2 m}{\nu^2 - \nu_0^2 + j\nu\Gamma} . \tag{13.59}$$

The magnitude of the dipole moment per unit volume, $P$, is equal to $-n_m e x_0$, where
$n_m$ is the density of gas molecules. The dielectric constant¹ $\varepsilon$ is equal to $1 + P/(\epsilon_0 E)$,
so that

$$\varepsilon = 1 - \frac{n_m e^2/4\pi^2 m\epsilon_0}{\nu^2 - \nu_0^2 + j\nu\Gamma} . \tag{13.60}$$


This classical model predicts neither the resonance frequency nor the absolute
amplitude of the oscillation. A full treatment of the problem requires the application
of quantum mechanics. The proper quantum-mechanical calculation for a system
with many resonances yields a result that closely resembles Eq. (13.60) [e.g.,
Loudon (1983)]:

$$\varepsilon = 1 - \frac{n_m e^2}{4\pi^2 m\epsilon_0}\sum_i \frac{f_i}{\nu^2 - \nu_{0i}^2 + j\nu\Gamma_i} , \tag{13.61}$$

where $f_i$ is the so-called oscillator strength of the $i$th resonance. The $f_i$ values obey
the sum rule, $\sum_i f_i = 1$.


¹In this section and in Sect. 13.3, we use SI (Système International) units, also known as rationalized
MKS units. In this system, the constitutive relation between the displacement vector $\mathbf{D}$, the electric
field vector $\mathbf{E}$, and the polarization vector $\mathbf{P}$ is $\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} = \epsilon\mathbf{E}$, where $\epsilon_0$ is the permittivity of
free space, and $\epsilon$ is the permittivity of the medium. The dielectric constant $\varepsilon$ is $\epsilon/\epsilon_0$. A comparison
of various systems of units and equations in electricity and magnetism can be found in Jackson
(1999).

The dielectric constant ($\varepsilon = \varepsilon_R + j\varepsilon_I$) and index of refraction ($n = n_R + jn_I$) are
connected by Maxwell's relation,

$$n^2 = \varepsilon . \tag{13.62}$$

Thus, $\varepsilon_R = n_R^2 - n_I^2$ and $\varepsilon_I = 2n_R n_I$. Since for a dilute gas $n_R \simeq 1$ and $n_I \ll 1$, we
have $n_R \simeq \sqrt{\varepsilon_R}$ and $n_I \simeq \varepsilon_I/2$. Therefore, for a gas with a single resonance,

$$n_R \simeq 1 - \frac{n_m e^2(\nu^2 - \nu_0^2)/8\pi^2 m\epsilon_0}{(\nu^2 - \nu_0^2)^2 + \nu^2\Gamma^2} \tag{13.63}$$

and

$$n_I \simeq \frac{n_m e^2\nu\Gamma/8\pi^2 m\epsilon_0}{(\nu^2 - \nu_0^2)^2 + \nu^2\Gamma^2} . \tag{13.64}$$

The resonance is usually sharp, that is, $\Gamma \ll \nu_0$, and the expressions for $n_R$ and
$n_I$ can be simplified by considering their behavior in the vicinity of the resonance
frequency $\nu_0$, in which case

$$\nu^2 - \nu_0^2 = (\nu + \nu_0)(\nu - \nu_0) \simeq 2\nu_0(\nu - \nu_0) . \tag{13.65}$$

Thus

$$n_R \simeq 1 - \frac{2b(\nu - \nu_0)}{(\nu - \nu_0)^2 + \Gamma^2/4} \tag{13.66}$$

and

$$n_I \simeq \frac{b\Gamma}{(\nu - \nu_0)^2 + \Gamma^2/4} , \tag{13.67}$$

where $b = n_m e^2/32\pi^2 m\epsilon_0\nu_0$.
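
The near-resonance forms can be checked numerically. The sketch below evaluates the Lorentzian absorption profile $b\Gamma/[(\nu-\nu_0)^2+\Gamma^2/4]$ and the dispersion profile $-2b(\nu-\nu_0)/[(\nu-\nu_0)^2+\Gamma^2/4]$, and confirms that the absorption peak is $4b/\Gamma$, that it falls to half at $\nu_0 \pm \Gamma/2$, and that the dispersion term reaches $\pm 2b/\Gamma$ at $\nu_0 \mp \Gamma/2$. The value of `B` is arbitrary and the function names are assumptions.

```python
def n_imag(nu, nu0, gamma, b):
    """Imaginary part of the index of refraction near resonance (Lorentzian)."""
    return b * gamma / ((nu - nu0) ** 2 + gamma ** 2 / 4.0)

def n_real_minus_one(nu, nu0, gamma, b):
    """Deviation of the real part of the index from unity near resonance."""
    return -2.0 * b * (nu - nu0) / ((nu - nu0) ** 2 + gamma ** 2 / 4.0)

# Illustrative parameters: line center and width in GHz; b is arbitrary.
NU0, GAMMA, B = 22.2, 2.6, 1.0e-8

peak_abs = n_imag(NU0, NU0, GAMMA, B)                           # 4b/Gamma
half_abs = n_imag(NU0 + GAMMA / 2.0, NU0, GAMMA, B)             # 2b/Gamma
max_disp = n_real_minus_one(NU0 - GAMMA / 2.0, NU0, GAMMA, B)   # +2b/Gamma
min_disp = n_real_minus_one(NU0 + GAMMA / 2.0, NU0, GAMMA, B)   # -2b/Gamma
```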

Equation (13.67) defines an unnormalized Lorentzian profile for $n_I$ that is
symmetric about frequency $\nu_0$ and has a full width at half-maximum of $\Gamma$ and a
peak amplitude of $4b/\Gamma$. The function $n_R - 1$ is antisymmetric about frequency
$\nu_0$ and has extreme values of $\pm 2b/\Gamma$ at frequencies $\nu_0 \mp \Gamma/2$, respectively. The
functions $n_R$ and $n_I$ are plotted in Fig. 13.8. Note that the peak deviation from unity
in the real part of the index of refraction, $\Delta n$, is equal to one-half the peak value of
$n_I$, denoted $n_{I\max}$. Thus, from Eq. (13.2), we see that the peak absorption coefficient,
$\alpha_m = 4\pi n_{I\max}\nu_0/c$, is related to $\Delta n$ by the formula

$$\Delta n = \frac{\alpha_m\lambda_0}{8\pi} , \tag{13.68}$$

where $\lambda_0$ is the wavelength of the resonance, $c/\nu_0$. The peak deviation of the real part
of the index of refraction is thus equal to the peak absorption over a distance of $\lambda_0/8\pi$.
In addition, Eq. (13.66) shows that the real part of the index of refraction is not
exactly antisymmetric about $\nu_0$; that is, $n_R$ tends to unity as $\nu$ tends to $\infty$, and $n_R$ tends
to $1 + 2b/\nu_0 = 1 + \Delta n\,\Gamma/\nu_0 = 1 + (\alpha_m\lambda_0/8\pi)(\Gamma/\nu_0)$ as $\nu$ tends to zero. Hence,
the change $\delta n$ in the asymptotic value of the index of refraction on passing through
a resonance is given by

$$\delta n = \frac{\alpha_m\Gamma\lambda_0^2}{8\pi c} . \tag{13.69}$$

Fig. 13.8 Real and imaginary parts of the index of refraction vs. frequency for a single resonance,
given by Eqs. (13.63) and (13.64). The case shown is for the $6_{16}$–$5_{23}$ transition in pure water vapor
with $\rho_V = 7.5$ g m$^{-3}$. In the atmosphere at the standard sea-level pressure of 1013 mb, the line
is broadened to about 2.6 GHz (Liebe 1969). For the curve $n_R - 1$, the peak deviation is $\Delta n$ [see
Eq. (13.68)], and the change in level passing through the line is $\delta n$ [see Eq. (13.69)].


Thus, $\delta n/\Delta n = \Gamma/\nu_0$, but unless the resonance is extremely strong, $\Delta n$ and $\delta n$
are both negligible. Consider the 22-GHz water vapor line. The attenuation in the
atmosphere when $\rho_V = 7.5$ g m$^{-3}$ is 0.15 dB km$^{-1}$, so $\alpha_m = 3.5 \times 10^{-7}$ cm$^{-1}$.
Equation (13.68) then predicts that $\Delta n = 1.9 \times 10^{-8}$, or $\Delta N = 0.019$, which
agrees with the value measured in the laboratory (Liebe 1969). For the same value
of $\rho_V$, the contribution of all transitions of water vapor to the value of the index of
refraction at low frequencies ($10^{-6}N_V$), from Eq. (13.10), is equal to $4.4 \times 10^{-5}$.
Thus, the fractional change in refractivity near the 22-GHz line is only 1 part in
2500. The change in asymptotic level is even smaller. At sea level, $\Gamma = 2.6$ GHz
and $\delta n = 2.2 \times 10^{-9}$. The water vapor line at 557 GHz (the $1_{10}$–$1_{01}$ transition)
has an absorption coefficient of 29,000 dB km$^{-1}$, or 0.069 cm$^{-1}$. The values of $\Delta n$
and $\delta n$ are $1.44 \times 10^{-4}$ and $0.7 \times 10^{-6}$, respectively. In the atmospheric windows
above 400 GHz, where radio astronomical observations are possible only from very
dry sites, the refractive index can be noticeably different from the value at lower
frequencies. The normalized refractivity is shown in Fig. 13.9.

Fig. 13.9 The predicted excess path length due to water vapor per unit column density vs.
frequency, from formulas by Liebe (1989) for $T = 270$ K and $P = 750$ mb. From Sutton and
Hueckstaedt (1996), reproduced with permission. © ESO.
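
The numbers quoted for the two water vapor lines can be reproduced from the peak-absorption relation $\Delta n = \alpha_m\lambda_0/8\pi$ and the ratio $\delta n/\Delta n = \Gamma/\nu_0$. The sketch below is a minimal check under two assumptions that are not stated explicitly in the text: the line center is taken as 22.2 GHz, and the sea-level width $\Gamma = 2.6$ GHz is applied to the 557-GHz line as well. Function names are hypothetical.

```python
import math

C_CM_PER_S = 2.998e10  # speed of light in cm/s

def db_per_km_to_per_cm(atten_db_km):
    """Convert power attenuation in dB/km to an absorption coefficient in cm^-1."""
    return atten_db_km * math.log(10.0) / 10.0 / 1.0e5

def peak_index_deviation(alpha_m, nu0_hz):
    """Delta n = alpha_m * lambda_0 / (8 pi), with lambda_0 = c / nu_0."""
    return alpha_m * (C_CM_PER_S / nu0_hz) / (8.0 * math.pi)

def asymptotic_change(delta_n_peak, gamma_hz, nu0_hz):
    """delta n = Delta n * Gamma / nu_0."""
    return delta_n_peak * gamma_hz / nu0_hz

GAMMA = 2.6e9  # sea-level line width in Hz (assumed for both lines)

alpha_22 = db_per_km_to_per_cm(0.15)
dn_22 = peak_index_deviation(alpha_22, 22.2e9)
small_dn_22 = asymptotic_change(dn_22, GAMMA, 22.2e9)

alpha_557 = db_per_km_to_per_cm(29000.0)
dn_557 = peak_index_deviation(alpha_557, 557.0e9)
small_dn_557 = asymptotic_change(dn_557, GAMMA, 557.0e9)
```

The results agree with the quoted values to within the rounding of the input attenuation figures.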

Equation (13.68) is an important result of very general validity. We derived it 
from a specific model [Eq. (13.58)] that led to an approximately Lorentzian profile 
for the absorption spectrum. In practice, line profiles are found to differ slightly 
from the Lorentzian form, and more sophisticated models are needed to fit them 
exactly. However, Eqs. (13.68) and (13.69) can also be derived from the Kramers–Kronig
relation.

The low-frequency value of the index of refraction, as given by Eq. (13.9),
results from the contributions of all transitions at higher frequencies. Summing the
contributions [see Eq. (13.69)] of many lines, each characterized by parameters $\Delta n_i$,
$\Gamma_i$, $\alpha_{mi}$, and $\nu_{0i}$, we obtain the low-frequency value of the index of refraction:

$$n_s = 1 + \sum_i \frac{\alpha_{mi}\lambda_{0i}^2\Gamma_i}{8\pi c} = 1 + \sum_i \frac{\Gamma_i\,\Delta n_i}{\nu_{0i}} . \tag{13.70}$$

The water vapor molecule has a large number of strong rotational transitions in
the band from 10 μm to 0.3 mm (from 30 THz to 1000 GHz). The atmosphere is
opaque through most of this region because of these lines, which contribute about
98% of the low-frequency refractivity. The remainder comes from the 557-GHz line.

Grischkowsky et al. (2013) show that the full theoretical calculation behind 
Eq. (13.70), based on Van Vleck–Weisskopf line shapes and incorporating all water
lines from 22.2 GHz through 30 THz, gives agreement with the empirical expression 
for refractivity without any ad hoc corrections. Complete computer codes for the 
atmospheric absorption and refraction have been developed by Pardo et al. (2001a) 
and Paine (2016). 


13.1.5 Radio Refractivity 


A detailed discussion of the radio refractivity equation can be found in the report of a
working group of the International Association of Geodesy (Rüeger 2002). Previous
work on combining laboratory measurements includes Bean and Dutton (1966),
Thayer (1974), Hill et al. (1982), and Bevis et al. (1994). From the classic work
of Debye (1929), it can be shown that the refractivity of molecules with induced
dipole transitions varies as pressure and $T^{-1}$, and the refractivity of molecules with
permanent dipole moments varies as pressure and $T^{-2}$. The principal constituents
of the atmosphere, oxygen molecules (O$_2$) and nitrogen molecules (N$_2$), being
homonuclear, have no permanent electric dipole moments. However, molecules such
as H$_2$O and other minor trace constituents have permanent dipole moments. Thus,
the general form of the refractivity equation is

$$N = \frac{K_1 p_D}{T Z_D} + \frac{K_2 p_V}{T Z_V} + \frac{K_3 p_V}{T^2 Z_V} , \tag{13.71}$$

where $p_D$ and $p_V$ are the partial pressures of the dry air and water vapor; $K_1$, $K_2$,
and $K_3$ are constants; and $Z_D$ and $Z_V$ are compressibility factors for dry-air gases
and water vapor, which correct for nonideal gas behavior and deviate from unity in
atmospheric conditions by less than 1 part in $10^3$. These compressibility factors are
given by Owens (1967) but are usually assumed to be equal to unity and their effects
absorbed into the $K$ coefficients.

The first and second terms in Eq. (13.71) are due to ultraviolet electronic
transitions of the induced dipole type for dry-air molecules and water vapor,
respectively, and the third term is due to the permanent dipole infrared rotational
transitions of water vapor. The best values of the parameters are $K_1 = 77.6898$,
$K_2 = 71.2952$, and $K_3 = 375463$, based on a weighted average of all available
experimentally derived values before 2002, as presented by Rüeger (2002). These
values were the result of working groups of the IUGG and the IAG. Thus, as in
Eq. (13.6),

$$N = 77.6898\,\frac{p_D}{T} + 71.2952\,\frac{p_V}{T} + 375463\,\frac{p_V}{T^2} . \tag{13.72}$$

The accuracy of this expression at the zero-frequency limit is conservatively
estimated to be 0.02% for the $p_D$ term and 0.2% for the $p_V$ terms. We can rewrite
Eq. (13.72) in terms of the total pressure ($P = p_D + p_V$) as

$$N = 77.7\,\frac{P}{T} - 6.4\,\frac{p_V}{T} + 375463\,\frac{p_V}{T^2} . \tag{13.73}$$

For temperatures around 280 K, the last two terms on the right side of Eq. (13.73)
can be combined to give the well-known two-term Smith–Weintraub equation
(Smith and Weintraub 1953) that has been widely used in the radio science
community. Using the best available parameters in 1953, this equation is

$$N = \frac{77.6}{T}\left(P + 4810\,\frac{p_V}{T}\right) . \tag{13.74}$$

The accuracy of Eqs. (13.73) and (13.74) at frequencies above zero can be improved
by adding a small term that increases monotonically with frequency to account for
the effect of the wings of the infrared transitions (see Fig. 13.9). Hill and Clifford
(1981) show that because of this effect, the wet refractivity increases by about 0.5%
at 100 GHz, and 2% at 200 GHz, over its value at low frequencies.
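
To see how closely the two-term Smith–Weintraub form tracks the three-term expression, the following sketch evaluates both for one set of surface conditions. It assumes pressures in millibars and temperature in kelvins, which is the usual convention for these coefficients but is not stated in this passage; the function names and the sample conditions are likewise assumptions.

```python
K1, K2, K3 = 77.6898, 71.2952, 375463.0  # coefficients quoted from Rueger (2002)

def refractivity_three_term(p_dry, p_vapor, temp):
    """N = K1 p_D / T + K2 p_V / T + K3 p_V / T^2 (pressures in mb, T in K)."""
    return K1 * p_dry / temp + K2 * p_vapor / temp + K3 * p_vapor / temp ** 2

def refractivity_smith_weintraub(p_total, p_vapor, temp):
    """N = (77.6 / T) (P + 4810 p_V / T) (pressures in mb, T in K)."""
    return 77.6 / temp * (p_total + 4810.0 * p_vapor / temp)

# Illustrative sea-level conditions.
P_TOTAL, P_VAPOR, TEMP = 1013.0, 10.0, 280.0
n_full = refractivity_three_term(P_TOTAL - P_VAPOR, P_VAPOR, TEMP)
n_sw = refractivity_smith_weintraub(P_TOTAL, P_VAPOR, TEMP)
relative_difference = abs(n_full - n_sw) / n_full
```

For these conditions both expressions give $N \approx 329$, and they differ by roughly 0.1%.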

It is interesting to compare the refractivities at radio and optical wavelengths.
The term proportional to $T^{-2}$ is due to the infrared resonances of H$_2$O, because of
its permanent dipole moment, and does not affect the optical refractivity. On the
other hand, the terms proportional to $T^{-1}$ arise from the induced dipole moments
associated with resonances of oxygen and nitrogen and also water vapor in the
ultraviolet. Hence, to a first approximation, we estimate the optical refractivity by
omitting the permanent dipole term from Eq. (13.72) and obtain

$$N_{opt} \simeq 77.7\,\frac{p_D}{T} + 71.3\,\frac{p_V}{T} . \tag{13.75}$$

For precise work, Cox (2000) and Rüeger (2002) provide more accurate values for
$N_{opt}$ that include small terms having wavelength dependence to account for the
effects of the wings of ultraviolet transitions that cause it to increase about 3%
going from 1 to 0.3 μm. The ratio of the wet refractivity in the radio and optical
regions is obtained by omitting the dry-air terms from Eqs. (13.72) and (13.75):
$N_{Vrad}/N_{Vopt} \simeq 1 + 5830/T$. For $T \simeq 280$ K, this ratio is about equal to 22. Hence,
water vapor plays a much more prominent role in propagation issues in radio than
in optical astronomy.


13.1.6 Phase Fluctuations 


In the radio region, the most important nonuniformly distributed quantity in the
troposphere is the water vapor density. Variations in water vapor distribution in
the troposphere that move across an interferometer cause phase fluctuations that
degrade the measurements. In the optical region, variations in temperature, rather
than in water vapor content, are the principal cause of phase fluctuations. The
situation is depicted in Fig. 13.10. A critical dimension is the size of the first
Fresnel zone, $\sqrt{\lambda h}$, where $h$ is the distance between the observer and the screen.
For $\lambda = 1$ cm and $h = 1$ km, the Fresnel scale is about 3 m. The atmospherically
induced phase fluctuations on this scale are very small ($\ll 1$ rad). In this case,
the phase fluctuation can cause image distortion but not amplitude fluctuation (i.e.,
scintillation). This is known as the regime of weak scattering. Plasma scattering
in the interstellar medium belongs to the regime of strong scattering, where the
phenomena are considerably more complex (see Sect. 14.4).

Fig. 13.10 A cartoon of a two-element interferometer beneath a tropospheric screen of water
vapor irregularities of various sizes. The screen moves over the interferometer at a velocity
component $v_s$ parallel to the baseline. The distribution of these irregularities is important in
designing the phase compensation schemes discussed in Sect. 13.2. Note that fluctuations with
scale sizes larger than the baseline cover both antennas and do not affect the interferometer phase
significantly. From Masson (1994a), courtesy of and © the Astronomical Society of the Pacific.

The fluctuations along an initially plane wavefront that has traversed the atmosphere
can be characterized by a so-called structure function of the phase. This
function is defined as

$$D_\phi(d) = \left\langle \left[\Phi(x) - \Phi(x - d)\right]^2 \right\rangle , \tag{13.76}$$

where $\Phi(x)$ is the phase at point $x$, $\Phi(x - d)$ is the phase at point $x - d$, and
the angle brackets indicate an ensemble average. In practical applications, the
ensemble average must be approximated by a time average of suitable duration.
We assume that $D_\phi$ depends only on the magnitude of the separation between the
measurement points, that is, the projected baseline length $d$ of the interferometer.
The rms deviation in the interferometer phase is

$$\sigma_\phi = \sqrt{D_\phi(d)} . \tag{13.77}$$




For the sake of illustration, we assume a simple functional form for $\sigma_\phi$ given by

$$\sigma_\phi = \frac{2\pi a d^{\beta}}{\lambda} , \qquad d < d_m , \tag{13.78a}$$

and

$$\sigma_\phi = \sigma_m , \qquad d > d_m , \tag{13.78b}$$

where $a$ is a constant, and $\sigma_m = 2\pi a d_m^{\beta}/\lambda$. The form of Eqs. (13.78) is shown
in Fig. 13.11a. This form can be derived by assuming a multiple-scale power-law
model for the spectrum of the phase fluctuations. There is a limiting distance $d_m$
beyond which fluctuations do not increase noticeably, a few kilometers, roughly the
size of clouds. This limit is called the outer scale length of the fluctuations. Beyond
this dimension, the fluctuations in the path lengths become uncorrelated.

Fig. 13.11 (a) Simple model for the rms phase fluctuation induced by the troposphere in an
interferometer of baseline length $d$ given by Eqs. (13.78). (b) The point-source response function
$W_a(\theta)$ for various power-law models is obtained by taking the Fourier transform of the visibility
in the regime $d < d_m$. The values of $\theta_s$, the full width at half-maximum of $W_a(\theta)$, for each model
are: Gaussian ($\beta = 1$), $\sqrt{8\ln 2}\,a$; modified Lorentzian ($\beta = \tfrac{1}{2}$), $1.53\pi\lambda^{-1}a^2$; and Kolmogorov
($\beta = \tfrac{5}{6}$), $2.75\lambda^{-1/5}a^{6/5}$. $\lambda$ is the wavelength and $a$ is the constant defined in Eq. (13.78a).

First, consider an interferometer that operates in the domain of baselines shorter
than $d_m$. The measured visibilities $\mathcal{V}_m$ are related to the true visibilities $\mathcal{V}$ by the
equation

$$\mathcal{V}_m = \mathcal{V}e^{j\phi} , \tag{13.79}$$

where $\phi = \Phi(x) - \Phi(x - d)$ is a random variable describing the phase fluctuations
introduced by the atmosphere. If we assume $\phi$ is a Gaussian random variable with
zero mean, then the expectation of the visibility is

$$\langle\mathcal{V}_m\rangle = \mathcal{V}\langle e^{j\phi}\rangle = \mathcal{V}e^{-\sigma_\phi^2/2} = \mathcal{V}e^{-D_\phi/2} . \tag{13.80}$$


Consider the conceptually useful case in which $\beta = 1$. It would arise in an
atmosphere consisting of inhomogeneous wedges of scale size larger than the
baseline. In this case, $\sigma_\phi$ is proportional to $d$, and the constant $a$ is dimensionless.
Substituting Eq. (13.78a) into Eq. (13.80) yields

$$\langle\mathcal{V}_m\rangle = \mathcal{V}e^{-2\pi^2 a^2 q^2} , \tag{13.81}$$

where $q = \sqrt{u^2 + v^2} = d/\lambda$. On average, therefore, the measured visibility is the
true visibility multiplied by an atmospheric weighting function $w_a(q)$ given by

$$w_a(q) = e^{-2\pi^2 a^2 q^2} . \tag{13.82}$$

In the image plane, the derived map is the convolution of the true source distribution
and the Fourier transform of $w_a(q)$, which is

$$W_a(\theta) \propto e^{-\theta^2/2a^2} , \tag{13.83}$$

where $\theta$ is here the conjugate variable of $q$. The full width at half-maximum of
$W_a(\theta)$ is $\theta_s$, given by

$$\theta_s = \sqrt{8\ln 2}\,a . \tag{13.84}$$

Thus, the resolution is degraded because the derived map is convolved with a
Gaussian beam of width $\theta_s$ (in addition to the effects of any other weighting
functions, as described in Sect. 10.2.2). $\theta_s$ is the seeing angle. Images with finer
resolution than $\theta_s$ can often be obtained by use of adaptive calibration procedures
described in Sect. 11.3. Now, from Eq. (13.78a), we obtain

$$a = \frac{\sigma_\phi\lambda}{2\pi d} = \frac{\sigma_d}{d} , \tag{13.85}$$

where $\sigma_d = \sigma_\phi\lambda/2\pi$ is the rms uncertainty in path length. Thus, we obtain

$$\theta_s = 2.35\,\frac{\sigma_d}{d} \quad \text{(radians)} . \tag{13.86}$$

Since $\sigma_d/d$ is constant, $\theta_s$ is independent of wavelength. This independence results
from the condition $\beta = 1$ in Eq. (13.78a). For the radio regime, $\sigma_d$ is about 1 mm
on a baseline of 1 km, so $a \simeq 10^{-6}$ and $\theta_s \simeq 0.5''$. Let $d_0$ be the baseline length for
which $\sigma_\phi = 1$ rad. From Eq. (13.85), we see that Eq. (13.84) can be written in the
form

$$\theta_s = \frac{\sqrt{2\ln 2}}{\pi}\,\frac{\lambda}{d_0} \simeq 0.37\,\frac{\lambda}{d_0} . \tag{13.87}$$
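
The radio-regime estimate quoted above is easy to verify. The sketch below computes the Gaussian seeing angle $\sqrt{8\ln 2}\,\sigma_d/d$ for an rms path-length fluctuation of 1 mm on a 1-km baseline and converts it to arcseconds. The function name is an assumption made for the example.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def seeing_from_path_rms(sigma_d, baseline):
    """Seeing angle in radians for the beta = 1 model.

    sigma_d and baseline must be in the same length units."""
    a = sigma_d / baseline                       # dimensionless constant a
    return math.sqrt(8.0 * math.log(2.0)) * a    # FWHM of the Gaussian response

theta_s = seeing_from_path_rms(1.0e-3, 1.0e3)    # 1 mm rms on a 1 km baseline
theta_s_arcsec = theta_s * ARCSEC_PER_RAD
```

The result is about 0.49 arcsec, consistent with the value of roughly 0.5 arcsec given in the text.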


For the case in which $\beta$ is arbitrary, we find $W_a(\theta)$ by substituting Eq. (13.78a)
into Eq. (13.80) and writing the two-dimensional Fourier transform as a Hankel
transform (Bracewell 2000). Thus

$$W_a(\theta) \propto \int_0^\infty \exp\left[-2\pi^2 a^2\lambda^{2(\beta - 1)}q^{2\beta}\right] J_0(2\pi q\theta)\,q\,dq , \tag{13.88}$$

where $J_0$ is the Bessel function of order zero and $a$ has dimensions cm$^{1-\beta}$. In
general, $W_a(\theta)$ cannot be evaluated analytically. However, by making appropriate
substitutions in Eq. (13.88), it is easy to show that $\theta_s \propto a^{1/\beta}\lambda^{(\beta - 1)/\beta}$. A case that
can be treated analytically is the one for which $\beta = \tfrac{1}{2}$. In this case, we obtain
(Bracewell 2000, p. 338)

$$W_a(\theta) \propto \frac{1}{\left[\theta^2 + (\pi a^2/\lambda)^2\right]^{3/2}} , \tag{13.89}$$

which represents a Lorentzian profile raised to the 3/2 power and has very broad
skirts. The full width at half-maximum of $W_a(\theta)$ is

$$\theta_s = \frac{1.53\pi a^2}{\lambda} , \tag{13.90}$$

or

$$\theta_s = \frac{0.77}{2\pi}\,\frac{\lambda}{d_0} \simeq 0.12\,\frac{\lambda}{d_0} . \tag{13.91}$$

In the case of Kolmogorov turbulence, which is discussed later in this section, $\beta = 5/6$.
Numerical integration of Eq. (13.88) yields

$$\theta_s \simeq 2.75\,a^{6/5}\lambda^{-1/5} \simeq 0.30\,\frac{\lambda}{d_0} . \tag{13.92}$$

Plots of $W_a(\theta)$ for various power-law models of phase fluctuations are shown in
Fig. 13.11b.
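
The conversion between the two ways of writing the seeing angle follows from setting the rms phase to 1 rad at $d = d_0$, which gives $a = \lambda/(2\pi d_0^{\beta})$ and hence $\theta_s = k\,(2\pi)^{-1/\beta}\lambda/d_0$ when $\theta_s = k\,a^{1/\beta}\lambda^{(\beta-1)/\beta}$. The sketch below, with hypothetical function names, reproduces the coefficients 0.37, 0.12, and 0.30 for the three models and also checks the factor 1.53 from the analytic profile with $\beta = 1/2$.

```python
import math

def seeing_coefficient(k, beta):
    """Coefficient of lambda/d_0 in theta_s, given theta_s = k a^(1/beta) lambda^((beta-1)/beta)."""
    return k * (2.0 * math.pi) ** (-1.0 / beta)

# FWHM of 1 / (theta^2 + c^2)^(3/2), in units of c: solve (1 + x^2)^(3/2) = 2.
lorentzian_fwhm_factor = 2.0 * math.sqrt(2.0 ** (2.0 / 3.0) - 1.0)

coef_gaussian = seeing_coefficient(math.sqrt(8.0 * math.log(2.0)), 1.0)
coef_lorentzian = seeing_coefficient(lorentzian_fwhm_factor * math.pi, 0.5)
coef_kolmogorov = seeing_coefficient(2.75, 5.0 / 6.0)
```

The factor 2.75 for the Kolmogorov case is taken from the text, since it comes from numerical integration rather than a closed form.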
Now consider the case of an interferometer operating in the domain of baselines
greater than $d_m$, where $\sigma_\phi$ is a constant equal to $\sigma_m$. This case is most applicable
to VLBI arrays or to large connected-element arrays. If the timescale of the
fluctuation is short with respect to the measurement time, then, on average, all the
visibility measurements are reduced by a constant factor $e^{-\sigma_m^2/2}$. Thus, this type of
atmospheric fluctuation does not reduce the resolution. However, on average, the
measured flux density is reduced from the true value by the factor $e^{-\sigma_m^2/2}$. If the
timescale of the fluctuations is long with respect to the measurement time, then
each visibility measurement suffers a phase error $e^{j\phi}$. Assume that $K$ visibility
measurements are made of a point source of flux density $S$. The image of the source,
considering only one dimension for simplicity, is

$$W_a(\theta) = \frac{S}{K}\sum_{i=1}^{K} e^{j\phi_i}e^{j2\pi u_i\theta} . \tag{13.93}$$

The expectation of $W_a(\theta)$ at $\theta = 0$ is

$$\langle W_a(0)\rangle = S e^{-\sigma_m^2/2} . \tag{13.94}$$

The measured flux density is less than $S$. (Note: $\langle W_a(0)\rangle/S$ is sometimes called the
coherence factor of the interferometer.) The missing flux density is scattered around
the map. This is immediately evident from Parseval's theorem:

$$\sum |W_a(\theta)|^2 = \frac{1}{K}\sum |\mathcal{V}(u_i)|^2 = S^2 . \tag{13.95}$$

Thus, the total flux density could be obtained by integrating the square of the image-plane
response. The rms deviation in the flux density, measured at the peak response
for a source at $\theta = 0$, is $\sqrt{\langle W_a^2(0)\rangle - \langle W_a(0)\rangle^2}$, which we call $\sigma_S$. This quantity can
be calculated from Eq. (13.93) and is given by

$$\sigma_S = \frac{S}{\sqrt{K}}\sqrt{1 - e^{-\sigma_m^2}} , \tag{13.96}$$

which reduces to $\sigma_S \simeq S\sigma_m/\sqrt{K}$ when $\sigma_m \ll 1$.
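
A minimal numerical illustration of these two results follows: the coherence factor $e^{-\sigma_m^2/2}$ and the rms flux-density deviation $(S/\sqrt{K})\sqrt{1 - e^{-\sigma_m^2}}$, together with a check of the small-angle limit $S\sigma_m/\sqrt{K}$. The function names and sample values are assumptions for the example.

```python
import math

def coherence_factor(sigma_m):
    """Mean reduction of the peak response for rms phase sigma_m (radians)."""
    return math.exp(-sigma_m ** 2 / 2.0)

def flux_rms(flux, sigma_m, n_vis):
    """Rms deviation of the peak flux density for n_vis independent visibilities."""
    return flux / math.sqrt(n_vis) * math.sqrt(1.0 - math.exp(-sigma_m ** 2))

coherence_1rad = coherence_factor(1.0)    # about 0.61 for 1 rad rms
rms_small = flux_rms(1.0, 0.01, 100)      # small-angle case
rms_limit = 1.0 * 0.01 / math.sqrt(100)   # S * sigma_m / sqrt(K)
```

With an rms phase of 1 rad, about 39% of the flux density is lost from the peak, which shows why phase stability on long baselines matters even when resolution is unaffected.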


13.1.7 Kolmogorov Turbulence 


The theory of propagation through a turbulent neutral atmosphere has been treated 
in detail in the seminal publications of Tatarski (1961, 1971). This theory has been 
developed and applied extensively to problems of optical seeing [e.g., Roddier 
(1981), Woolf (1982), Coulman (1985)] and to infrared interferometry (Sutton et al. 
1982). We confine the discussion here to a few central ideas concerning the structure
function of phase and indicate how it is related to other functions that are used to
characterize atmospheric turbulence.

When the Reynolds number (a dimensionless parameter that involves the viscosity,
a characteristic scale size, and the velocity of a flow) exceeds a critical
value, the flow becomes turbulent. In the atmosphere, the Reynolds number is nearly
always high enough that turbulence is fully developed. In the Kolmogorov model
for turbulence, the kinetic energy associated with large-scale turbulent motions
is transferred to smaller and smaller scale sizes of turbulence until it is finally
dissipated into heat by viscous friction. If the turbulence is fully developed and
isotropic, then the two-dimensional power spectrum of the phase fluctuations (or the
refractive index) varies as $q_s^{-11/3}$, where $q_s$ (cycles per meter) is the spatial frequency
($q_s$, the conjugate variable of $d$, is analogous to $q$, the conjugate variable of $\theta$). The
structure function for the refractive index $D_n(d)$ is defined in a fashion similar to
the structure function of phase in Eq. (13.76); that is, $D_n(d)$ is the mean-squared
deviation of the difference in the refractive index at two points a distance $d$ apart, or
$D_n(d) = \langle[n(x) - n(x - d)]^2\rangle$. Note that only the scalar separation $d$ is important for
isotropic turbulence. For the conditions stated above, $D_n$ can be shown to be given
by the equation

$$D_n(d) = C_n^2 d^{2/3} , \qquad d_{in} \ll d \ll d_{out} , \tag{13.97}$$

where $d_{in}$ and $d_{out}$ are called the inner and outer scales of turbulence, which may
be less than a centimeter and a few kilometers, respectively. The parameter $C_n^2$
characterizes the strength of the turbulence. Note that water vapor, which is the
dominant cause of fluctuation in the index of refraction, is poorly mixed in the
troposphere and therefore may be only an approximate tracer of the mechanical
turbulence.

The details of the derivation of the structure function of phase from the structure
function of the index of refraction given in Eq. (13.97) are given in Appendix 13.2.
The result is that $D_\phi(d)$ for a uniform layer of turbulence of thickness $L$ has several
important power-law segments:

$$D_\phi(d) \sim d^{5/3} , \quad d_1 < d < d_2 ,$$
$$D_\phi(d) \sim d^{2/3} , \quad d_2 < d < d_{out} ,$$
$$D_\phi(d) \sim d^{0} , \quad d_{out} < d . \tag{13.98}$$

$d_1$ is the limit where diffractive effects become important. $d_1 \simeq \sqrt{L\lambda}$, so for
an atmospheric layer of $L = 2$ km, $d_1$ varies from 1.4 to 40 m for $\lambda$ ranging
from 1 mm to 1 m. The inner turbulence scale $d_{in}$ is considerably smaller and of
interest only at optical wavelengths. $d_2$ marks the transition between 3-D turbulence
and 2-D turbulence caused by the thickness of the layer. Stotskii (1973, 1976)
was the first to recognize the importance of this break for radio arrays (see also
Dravskikh and Finkelstein 1979). $d_{out}$ is the distance beyond which the fluctuations
are uncorrelated, as described in Sect. 13.1.6. $d_{out}$ is nominally the scale size of
clouds, a few kilometers. However, some correlation remains out to the scale size of
weather systems and beyond.

The structure function is formally an ensemble average. For practical purposes,
the turbulent eddies are assumed to remain fixed as the atmospheric layer moves
across an array. This is the frozen-screen hypothesis, sometimes attributed to Taylor
(1938). Practically, the rms fluctuations in phase increase with time up to the crossing
time $t_c = d/v_s$, where $v_s$ is the wind speed parallel to the baseline direction
corresponding to $d$. $t_c$ is called the corner time, beyond which the rms fluctuations
flatten out and $D_\phi(d)$ can be estimated. Atmospheric fluctuations on scales larger
than $d$ cover both receiving elements and do not contribute to the structure function.
An example of the structure function as a function of time measured at the
ALMA site by the 300-m satellite site-testing interferometer is shown in Fig. 13.12.
There, $t_c \simeq 20$ s, implying a wind speed of about 15 m s$^{-1}$.

Fig. 13.12 The rms phase deviation at the ALMA site, measured with a satellite site-testing
interferometer with a 300-m baseline at 11 GHz. The open symbols represent actual measurements;
the solid symbols have the instrument noise removed. The line through the data has a slope of 0.6,
as approximately expected by Kolmogorov theory [see Eq. (13.108)]. The break in the slope of the
data occurs at the instrument crossing time, $t_c$. An estimate of the ensemble average of the structure
function is reached for $t > t_c$. From Holdaway et al. (1995a).

We continue this section with a discussion of the primary case of 3-D turbulence
in which $D_\phi \sim d^{5/3}$ and then generalize the results for other power-law indices. As
derived in Appendix 13.2, for a uniform turbulence layer,

$$D_\phi(d) = 2.91\left(\frac{2\pi}{\lambda}\right)^2 C_n^2 L\,d^{5/3} , \tag{13.99}$$

which is valid in the range $\sqrt{\lambda L} \ll d \ll L$. The lower limit on $d$ is equivalent
to the requirement that diffraction effects be negligible. Note that the factor 2.91
is a dimensionless constant, and $C_n^2$ has units of length$^{-2/3}$. This factor appears in
calculations based on $D_n$ as defined in Eq. (13.97). (It is sometimes absorbed into
$C_n^2$.)

We can generalize Eq. (13.99) for a stratified turbulent layer. The structure
function of phase for an atmosphere in which $C_n^2$ varies with height from the surface
to an overall height $L$ is given by

$$D_\phi(d) = 2.91\left(\frac{2\pi}{\lambda}\right)^2 d^{5/3}\int_0^L C_n^2(h)\,dh . \tag{13.100}$$

The rms phase deviation is the square root of the phase structure function, or, when
$C_n^2$ is a constant,

$$\sigma_\phi = 1.71\left(\frac{2\pi}{\lambda}\right)\sqrt{C_n^2 L}\;d^{5/6} . \tag{13.101}$$

The baseline length for which $\sigma_\phi = 1$ rad is defined as $d_0$ and is given by

$$d_0 = 0.058\,\lambda^{6/5}\left(C_n^2 L\right)^{-3/5} . \tag{13.102}$$


Another scale length that is proportional to $d_0$ is the Fried length, $d_f$ (Fried 1966).
This scale is particularly useful for discussions of the effects of turbulence in
telescopes with circular apertures and is widely used in the optical literature.
The structure function of phase can be written as $D_\phi = 6.88(d/d_f)^{5/3}$, where
the factor 6.88 is an approximation of $2[(24/5)\Gamma(6/5)]^{5/6}$ (Fried 1967). Hence,
from Eqs. (13.99) and (13.102), $d_f = 3.18d_0$. The Fried length is defined such that
the effective collecting area of a large circular aperture with uniform illumination in
the presence of Kolmogorov turbulence is $\pi d_f^2/4$. Hence, for an aperture of diameter
small with respect to $d_f$, the resolution is dominated by diffraction at the aperture.
With an aperture large with respect to $d_f$, the resolution is set by the turbulence and
is approximately $\lambda/d_f$. The exact resolution in this latter case can be derived from
Eq. (13.92), with the result $\theta_s = 0.97\lambda/d_f$. In addition, the rms phase error over an
aperture of diameter $d_f$ is 1.01 rad. The reason that $d_f$ is larger than $d_0$ is related to
the downweighting of long baselines in two-dimensional apertures [see Eq. (15.13)
and related discussion]. For an aperture of diameter $d_f$, the ratio of the collecting
area to the geometric area, which is called the Strehl ratio in the optical literature, is
equal to 0.45 (Fried 1965).
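
The relations among the structure function, $d_0$, and the Fried length can be checked numerically. The sketch below writes $D_\phi(d) = A\,d^{5/3}$ with $A = 2.91(2\pi/\lambda)^2 C_n^2 L$, so that $d_0 = A^{-3/5}$ and $d_f = 6.88^{3/5} d_0$. The function names and the value of $C_n^2 L$ are illustrative assumptions, not measured quantities.

```python
import math

def structure_coefficient(wavelength, cn2_l):
    """A in D_phi(d) = A d^(5/3) for a uniform layer; lengths in meters."""
    return 2.91 * (2.0 * math.pi / wavelength) ** 2 * cn2_l

def rms_phase(d, wavelength, cn2_l):
    """Rms phase in radians on baseline d (square root of the structure function)."""
    return math.sqrt(structure_coefficient(wavelength, cn2_l) * d ** (5.0 / 3.0))

def d0_length(wavelength, cn2_l):
    """Baseline at which the rms phase equals 1 rad."""
    return structure_coefficient(wavelength, cn2_l) ** (-3.0 / 5.0)

def fried_length(wavelength, cn2_l):
    """Fried length from D_phi = 6.88 (d / d_f)^(5/3)."""
    return 6.88 ** (3.0 / 5.0) * d0_length(wavelength, cn2_l)

WAVELENGTH = 0.01   # 1 cm, in meters
CN2_L = 1.0e-11     # integrated turbulence strength, m^(1/3); illustrative only

d0 = d0_length(WAVELENGTH, CN2_L)
df = fried_length(WAVELENGTH, CN2_L)
d0_coefficient = (2.91 * 4.0 * math.pi ** 2) ** (-3.0 / 5.0)              # about 0.058
fried_constant = 2.0 * ((24.0 / 5.0) * math.gamma(6.0 / 5.0)) ** (5.0 / 6.0)  # about 6.88
```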

Equation (13.102) shows that $d_0$ is proportional to $\lambda^{6/5}$, and thus the angular
resolution or seeing limit ($\sim \lambda/d_0$) is proportional to $\lambda^{-1/5}$ [see Fig. 13.11 and
Eq. (13.92)]. This relationship may hold over broad wavelength ranges when $C_n^2$ is
constant. In the optical range, $C_n^2$ is related to temperature fluctuations, whereas in
the radio range, $C_n^2$ is dominated by turbulence in the water vapor. It is an interesting
coincidence that the seeing angle is about 1″ at both optical and radio wavelengths,
for good sites. The important difference is the timescale of fluctuations, $\tau_{\rm cr}$. If the
critical level of fluctuation is 1 radian, then $\tau_{\rm cr} \simeq d_0/v_s$, where $v_s$ is the velocity
component of the screen parallel to the baseline. Any adaptive optics compensation
must operate on a timescale short with respect to $\tau_{\rm cr}$. From Eq. (13.92), $\tau_{\rm cr}$ can be
expressed as

$$\tau_{\rm cr} \simeq 0.3\,\frac{\lambda}{\theta_s v_s} . \tag{13.103}$$

For $v_s = 10$ m s$^{-1}$ and $\theta_s = 1''$, $\tau_{\rm cr} = 3$ ms at 0.5 µm wavelength and 60 s at 1 cm
wavelength.
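
The two timescales quoted above can be checked with a few lines of Python. This is only an illustrative sketch of Eq. (13.103); the function name and argument names are our own and do not come from the text.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def critical_timescale(wavelength_m, seeing_arcsec, wind_speed_m_s):
    """Eq. (13.103): tau_cr ~ 0.3 * lambda / (theta_s * v_s), in seconds."""
    return 0.3 * wavelength_m / (seeing_arcsec * ARCSEC * wind_speed_m_s)


# Values used in the text: v_s = 10 m/s and 1 arcsec seeing.
tau_optical = critical_timescale(0.5e-6, 1.0, 10.0)  # 0.5 micrometer wavelength
tau_radio = critical_timescale(1.0e-2, 1.0, 10.0)    # 1 cm wavelength

print(f"optical: {tau_optical * 1e3:.1f} ms, radio: {tau_radio:.0f} s")
```

The results are close to 3 ms and 60 s, which illustrates why adaptive correction is so much more demanding in the optical range than in the radio range.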

The two-dimensional power spectrum of phase, $S_2(q_x, q_y)$, is the Fourier transform
of the two-dimensional autocorrelation function of phase, $R_\phi(d_x, d_y)$. If $R_\phi$
is a function only of $d$, where $d^2 = d_x^2 + d_y^2$, then $S_2$ is a function of $q_s$,
where $q_s^2 = q_x^2 + q_y^2$, and $S_2(q_s)$ and $R_\phi(d)$ form a Hankel transform pair. Since
$D_\phi(d) = 2[R_\phi(0) - R_\phi(d)]$, we can write

$$D_\phi(d) = 4\pi \int_0^\infty [1 - J_0(2\pi q_s d)]\, S_2(q_s)\, q_s\, dq_s , \tag{13.104}$$

where $J_0$ is the Bessel function of order zero. When $D_\phi(d)$ is given by Eq. (13.100),
$S_2(q_s)$ is

$$S_2(q_s) = 0.0097 \left(\frac{2\pi}{\lambda}\right)^2 C_n^2 L\, q_s^{-11/3} . \tag{13.105}$$

It is often useful to study temporal variations caused by atmospheric turbulence. 
In order to relate the temporal and spatial variations, we invoke the frozen-screen 
hypothesis. The one-dimensional temporal spectrum of the phase fluctuations $S'_\phi(f)$
(the two-sided spectrum) can be calculated from $S_2(q_s)$ by

$$S'_\phi(f) = \frac{1}{v_s} \int_{-\infty}^{\infty} S_2\!\left(q_x = \frac{f}{v_s},\, q_y\right) dq_y , \tag{13.106}$$

where $v_s$ is in meters per second. Substitution of Eq. (13.105) into Eq. (13.106)
yields

$$S'_\phi(f) = 0.016 \left(\frac{2\pi}{\lambda}\right)^2 C_n^2 L\, v_s^{5/3} f^{-8/3} \quad ({\rm rad^2\ Hz^{-1}}) . \tag{13.107}$$


Examples of the temporal spectra of water vapor fluctuations can be found in Hogg 
et al. (1981) and Masson (1994a) (see Fig. 13.17). The temporal structure function
$D_\tau(\tau) = \langle[\phi(t) - \phi(t - \tau)]^2\rangle$ is related to the spatial structure function by
$D_\tau(\tau) = D_\phi(d = v_s\tau)$. Hence, for Kolmogorov turbulence, we obtain from Eq. (13.99)

$$D_\tau(\tau) = 2.91 \left(\frac{2\pi}{\lambda}\right)^2 C_n^2 L\, v_s^{5/3} \tau^{5/3} . \tag{13.108}$$

$D_\tau(\tau)$ and $S'_\phi(f)$ are related by a transformation similar to Eq. (13.104). The use of
temporal structure functions to estimate the effects of fluctuations on interferometers
is discussed by Treuhaft and Lanyi (1987) and Lay (1997a).

The Allan variance $\sigma_y^2(\tau)$, or fractional frequency stability for time interval $\tau$,
associated with $S'_\phi(f)$ has been defined in Sect. 9.5.1. It can be calculated by
substituting Eq. (9.119) into Eq. (9.131), which gives

$$\sigma_y^2(\tau) = \left(\frac{2}{\pi \nu_0 \tau}\right)^2 \int_0^\infty S'_\phi(f) \sin^4(\pi \tau f)\, df . \tag{13.109}$$

By substituting Eq. (13.107) into Eq. (13.109), and noting that

$$\int_0^\infty \frac{\sin^4(\pi x)}{x^{8/3}}\, dx = 4.61 ,$$

we obtain

$$\sigma_y^2(\tau) = 1.3 \times 10^{-17}\, C_n^2 L\, v_s^{5/3} \tau^{-1/3} . \tag{13.110}$$
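
The numerical constants in Eqs. (13.107) and (13.110) can be cross-checked from the coefficient 0.0097 of the two-dimensional spectrum. The sketch below is our own verification, not part of the original derivation. It uses two standard closed forms as assumptions: the line integral of $(q_x^2 + q_y^2)^{-11/6}$ over $q_y$, and the integral of $(1 - \cos bx)/x^{1+a}$ over $0 < x < \infty$ for $0 < a < 2$. It also takes $\lambda\nu_0 = c$.

```python
import math

# Coefficient of the two-dimensional phase spectrum (value quoted in the text).
K2 = 0.0097

# Integral of (q_x^2 + q_y^2)^(-11/6) over all q_y equals
# sqrt(pi) * Gamma(4/3) / Gamma(11/6) * |q_x|^(-8/3).
line_integral = math.sqrt(math.pi) * math.gamma(4 / 3) / math.gamma(11 / 6)
K1 = K2 * line_integral  # coefficient of the temporal spectrum

# Closed form of the integral of sin^4(pi x) / x^(8/3) from 0 to infinity, using
# sin^4(u) = [4 (1 - cos 2u) - (1 - cos 4u)] / 8 and
# int_0^inf (1 - cos(b x)) / x^(1+a) dx = b^a * pi / (2 Gamma(1+a) sin(pi a / 2)).
a = 5 / 3
base = math.pi / (2 * math.gamma(1 + a) * math.sin(math.pi * a / 2))
sin4_integral = base * (4 * (2 * math.pi) ** a - (4 * math.pi) ** a) / 8

# Allan-variance coefficient: (2 / (pi nu0 tau))^2 times the spectrum integral
# reduces to 16 * K1 * sin4_integral / c^2 once lambda * nu0 = c is used.
C_LIGHT = 2.998e8  # m/s
allan_coeff = 16 * K1 * sin4_integral / C_LIGHT ** 2

print(f"K1 = {K1:.4f}, integral = {sin4_integral:.2f}, coefficient = {allan_coeff:.2e}")
```

The three printed values agree with 0.016, 4.61, and $1.3 \times 10^{-17}$ to the precision quoted.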


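Armstrong and Sramek (1982) give general expressions for the relations among
$S_2$, $S'_\phi$, $D_\phi$, and $\sigma_y$ for an arbitrary power-law index. If $S_2 \propto q_s^{-\alpha}$, then
$D_\phi(d) \propto d^{\alpha-2}$, $S'_\phi \propto f^{1-\alpha}$, and $\sigma_y^2 \propto \tau^{\alpha-4}$. These relations are summarized in Table 13.2.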
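
As a small illustration of these relations, the exponents for any spectral index can be generated with exact fractions. The function and dictionary keys below are hypothetical names chosen for this sketch.

```python
from fractions import Fraction


def turbulence_exponents(alpha):
    """Power-law exponents implied by a spectrum S_2 proportional to q^(-alpha)."""
    alpha = Fraction(alpha)
    return {
        "power_spectrum": -alpha,                  # S_2(q_s)
        "structure_function": alpha - 2,           # D_phi(d)
        "temporal_phase_spectrum": 1 - alpha,      # S'_phi(f)
        "allan_variance": alpha - 4,               # sigma_y^2(tau)
        "temporal_structure_function": alpha - 2,  # D_tau(tau)
    }


three_d = turbulence_exponents(Fraction(11, 3))  # thick-layer (3-D) turbulence
two_d = turbulence_exponents(Fraction(8, 3))     # thin-layer (2-D) turbulence

for name in three_d:
    print(f"{name:30s} {str(three_d[name]):>6s} {str(two_d[name]):>6s}")
```

Using `Fraction` keeps the exponents exact, so the output can be compared directly with the tabulated values.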

The actual behavior of the atmosphere is more complex than described above, but 
the theory developed provides a general guide. An example of a structure function of 
phase from the VLA is shown in Fig. 13.13 (see also Fig. 13.22 for a similar plot for 
ALMA). It clearly shows the three power-law regions, with power-law exponents 
close to their expected values. The effects of phase noise on VLBI observations are 
discussed by Rogers and Moran (1981) and Rogers et al. (1984). The plot of Allan 
variance by Rogers and Moran is shown in Fig. 9.17. 
Table 13.2 Power law relations for turbulence

                                                               3-D turbulence      2-D turbulence
Quantity                                          Exponent     ($\alpha = 11/3$)    ($\alpha = 8/3$)
2-D, 3-D power spectrum $S_2(q_s)$, $S(q_s)$      $-\alpha$    $-11/3$             $-8/3$
Structure function $D_\phi(d)$                    $\alpha - 2$ $5/3$               $2/3$
Temporal phase spectrum $S'_\phi(f)$              $1 - \alpha$ $-8/3$              $-5/3$
Allan variance $\sigma_y^2(\tau)$                 $\alpha - 4$ $-1/3$              $-4/3$
Temporal structure function $D_\tau(\tau)$        $\alpha - 2$ $5/3$               $2/3$

Adapted from Wright (1996, p. 526).
Fig. 13.13 The root phase structure function (rms phase) from observations with the VLA at
22 GHz. The open circles show the rms phase variation vs. baseline length measured on the source
0748+240 over a period of 90 min. The filled squares show the data after removal of a constant
receiver-induced noise component of rms amplitude 10°. The three regimes of the phase structure
function are indicated by vertical lines (at 1.2 and 6 km). Note that $\beta = \alpha/2$. From Carilli and
Holdaway (1999). © 1999 by the American Geophys. Union.
13.1.8 Anomalous Refraction


The beamwidths of many millimeter radio telescopes are sufficiently small that 
the effect of atmospheric phase fluctuations can be detected. This effect was first 
noticed with the 30-m-diameter millimeter-wavelength telescope on Pico de Veleta, 
where the apparent positions of unresolved sources were observed to wander by 
about 5” on timescales of a few seconds under certain meteorological conditions 
[see, e.g., Altenhoff et al. (1987), Downes and Altenhoff (1990), and Coulman 
(1991)]. This motion is due to the flow of the turbulent layer of water vapor across 
the telescope aperture, which is distinct from the refraction caused by the quasi- 
static atmosphere, and hence the term “anomalous refraction.” This effect can be 
understood by a simple application of the theory developed in Sect. 13.1.7. The 
magnitude of the refraction is dominated by the turbulent cells of size equal to the 
diameter of the antenna. These cells can be thought of as refractive wedges moving 
across the aperture of the antenna. The rms value of the differential phase shift of 
such a wedge is equal to the square root of the structure function evaluated at the 
separation distance equal to the diameter of the antenna, $\sqrt{D_\phi(d)}$. Hence, the rms
value of the anomalous refraction for an observation at zenith is given by

$$\epsilon \simeq \frac{\sqrt{2 D_\phi(d)}}{d} , \tag{13.111}$$


where the structure function is in units of length and the factor of two accounts for
motion in both azimuth and elevation. Note that fluctuations on larger scales than $d$
are unimportant as long as the power-law exponent on the structure function is less
than two, as is usually the case. In the 3-D turbulence case, $\epsilon$ will vary as $\sqrt{\sec z}$.
If we express the rms phase fluctuations as $\sigma = \sigma_0 (d/100\ {\rm m})^{5/6}$ (see Table 13.4 for
values of $\sigma_0$), then the ratio of the anomalous refraction angle to the beamwidth,
$\theta_b \simeq 1.2\lambda/d$, is

$$\frac{\epsilon}{\theta_b} \simeq 1.2\,\frac{\sigma_0}{\lambda} \left(\frac{d}{100\ {\rm m}}\right)^{5/6} . \tag{13.112}$$


For example, the range of seasonal median values for $\sigma_0$ for the ALMA site is
0.045-0.17 mm. Since the diameter of the ALMA antennas is 12 m, the range of
$\epsilon$ is 0.2-0.6″ from Eq. (13.111), which is independent of wavelength. The timescale
of this effect is $d/v_s$, where $v_s$ is the wind speed. At a wavelength of 1 mm, the
beamwidth is about 20″, so the ratio $\epsilon/\theta_b$ has the range of 1% to 3%. There is no
effect on the amplitude of the incident electric field because the phase fluctuations 
arise in a layer close to ground. However, the fractional changes in antenna gain 
at the half-beamwidth point would range from 1.5% to 5%, which could have an 
effect on the quality of mosaic images derived from array observations under some 
conditions. For further details, see Holdaway and Woody (1998). Methods of real- 
time correction of anomalous refraction have been proposed by Lamb and Woody 
(1998). 
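
The ALMA numbers above can be reproduced approximately with the following sketch. It rests on an assumption that should be stated plainly: the factor of two for the two pointing axes is taken to enter the rms angle as $\sqrt{2}$, so that $\epsilon \simeq \sqrt{2}\,\sigma(d)/d$ with $\sigma(d) = \sigma_0 (d/100\ {\rm m})^{5/6}$ expressed as path length. The function names are ours.

```python
import math

ARCSEC = math.pi / (180.0 * 3600.0)  # radians per arcsecond


def anomalous_refraction_arcsec(sigma0_mm, dish_m):
    """rms pointing wander in arcsec, assuming eps = sqrt(2) * sigma(d) / d."""
    sigma_m = sigma0_mm * 1e-3 * (dish_m / 100.0) ** (5.0 / 6.0)
    return math.sqrt(2.0) * sigma_m / dish_m / ARCSEC


def beamwidth_arcsec(wavelength_m, dish_m):
    """Beamwidth theta_b ~ 1.2 * lambda / d, in arcsec."""
    return 1.2 * wavelength_m / dish_m / ARCSEC


dish = 12.0                             # ALMA antenna diameter, m
beam = beamwidth_arcsec(1.0e-3, dish)   # at 1 mm wavelength
eps_low = anomalous_refraction_arcsec(0.045, dish)
eps_high = anomalous_refraction_arcsec(0.17, dish)

print(f"beam = {beam:.1f} arcsec")
print(f"eps = {eps_low:.2f} to {eps_high:.2f} arcsec")
print(f"eps/beam = {100 * eps_low / beam:.1f}% to {100 * eps_high / beam:.1f}%")
```

Under this assumption the computed wander and the ratio to the beamwidth fall close to the ranges quoted in the text; the small differences are at the level of rounding in the quoted values.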
13.2.1 Opacity Measurements


At millimeter and submillimeter wavelengths, absorption and path length fluctu- 
ations in the atmosphere limit performance in synthesis imaging. This section is 
concerned with monitoring of atmospheric parameters for optimum choice of sites 
and with methods of calibrating the atmosphere to reduce phase errors. This subject 
has received much attention as a result of the development of major instruments at 
millimeter and submillimeter wavelengths. 

For given atmospheric parameters, the zenith opacity (optical depth) $\tau_0$ can be
calculated as a function of frequency using the propagation models of Liebe (1989),
Pardo et al. (2001a), or Paine (2016). Figure 13.14 shows curves of transmission,
$\exp(-\tau_0)$, for 4 mm of precipitable water at an elevation of 2124 m and 1 mm at
5000 m, corresponding to the VLA and ALMA sites, respectively. For the purpose of
choosing a suitable observatory site, detailed monitoring of the atmosphere covering
both diurnal and annual variation is necessary. We assume that the zenith opacity has
the form

$$\tau_0 = A_\nu + B_\nu w , \tag{13.113}$$

where $A_\nu$ and $B_\nu$ are empirical constants that depend on frequency, site elevation,
and meteorological conditions. Selected measurements of these constants are given
in Table 13.3.

The opacity can be monitored by measuring the total noise power received in a
small antenna as a function of zenith angle (i.e., the tipping-scan method described
in Sect. 13.1.3). A commonly used frequency for opacity monitoring is 225 GHz,
which lies within the 200-310 GHz atmospheric window (see Figs. 13.7 and 13.14)
in the vicinity of the important CO 2-1 rotational transition at 230 GHz.

A typical site-test radiometer designed for opacity measurements uses a small
parabolic primary reflector with a beamwidth of $\sim 3^\circ$ at 225 GHz. A wheel with
blades that act as plane reflectors is inserted at the beam waist between the primary
and secondary reflectors and sequentially directs the input of the receiver to the
output of the antenna, a reference load at 45 °C, and a calibration load at 65 °C. The
amplified signals go to a power-linear detector and then to a synchronous detector
that produces voltages proportional to the difference between the antenna and the
45 °C load, which is the required output, and the difference between the 45 °C and
65 °C loads, which provides a calibration. Measurements of the antenna temperature
are made at a range of different zenith angles. When connected to the antenna, the
measured noise temperature of such a system, $T_{\rm meas}$, consists of three components:

$$T_{\rm meas} = T_{\rm const} + T_{\rm at}\left(1 - e^{-\tau_0 \sec z}\right) + T_{\rm cmb}\, e^{-\tau_0 \sec z} . \tag{13.114}$$
Fig. 13.14 (top) The zenith atmospheric transmission (equal to $e^{-\tau_0}$) at the ALMA site (5000 m,
with 1 mm of precipitable water vapor). There are additional windows with transmissions of about
0.3% near 1100, 1300, and 1500 GHz. There are no additional windows with higher transmission
up to 7500 GHz (40 µm). (bottom) The zenith transmission at the VLA site (2124 m, with 4 mm
of precipitable water). Note that the transmission depends on the altitude because of the pressure
broadening of the absorption lines. Because of this effect, for a fixed value of $w$, the transmission
at any frequency in an atmospheric window will be lower for lower altitude sites. The many
narrow absorption features (line widths of $\sim 100$ MHz) are caused by stratospheric ozone lines
[for a catalog of these lines, see Lichtenstein and Gallagher (1971)]. The effects of these lines
in astronomical observations can be removed by careful bandpass calibration. These transmission
plots were calculated with the am code (Paine 2016) with profiles for mean midlatitude atmospheric
conditions.


Here $T_{\rm const}$ represents the sum of noise components that remain constant as the
antenna elevation is varied, that is, the receiver noise, thermal noise resulting from
losses between the antenna and the receiver input, any offset in the radiometer
detector, and so on. The second term in Eq. (13.114) represents the component of
noise from the atmosphere: $T_{\rm at}$ is the temperature of the atmosphere, and $z$ is the
zenith angle. $T_{\rm cmb} \simeq 2.7$ K represents the cosmic microwave background radiation.
It will be assumed that $T_{\rm at}$ and $T_{\rm cmb}$ represent brightness temperatures that are related
Table 13.3 Zenith opacity as a function of column height of water vapor

$\nu$                   Altitude   $A_\nu$     $B_\nu$
(GHz)   Location^a      (m)        (nepers)    (nepers mm$^{-1}$)   Method^b   Ref.^c
15      Sea level       0          0.013       0.002                1          1
22.2    Sea level       0          0.026       0.02                 1          1
35      Sea level       0          0.039       0.006                1          1
90      Sea level       0          0.039       0.018                1          1
225     South Pole      2835       0.030       0.069                2          2
225     Mauna Kea       4070       0.01        0.04                 2          3
225     Chajnantor      5000       0.006       0.033                2          4
225     Chajnantor      5000       0.007       0.041                2          5
493     South Pole      2835       0.33        1.49                 2          6

^a Locations: South Pole = Amundsen-Scott station; Mauna Kea = site of submillimeter telescopes
on Mauna Kea; Chajnantor = Llano de Chajnantor, Atacama Desert, Chile.

^b Methods: (1) opacity derived from radiosonde data, water vapor estimated from surface humidity
and scale height of 2 km; (2) opacity derived from tipping radiometer, water vapor column height
derived from radiosonde data.

^c References: (1) Waters (1976); (2) Chamberlin and Bally (1995); (3) Masson (1994a);
(4) Holdaway et al. (1996); (5) Delgado et al. (1998); (6) Chamberlin et al. (1997).


to the physical temperatures by the Planck or Callen and Welton formulas (see
Sect. 7.1.2). If $T_{\rm at}$ is known, it is straightforward to determine $\tau_0$ from $T_{\rm meas}$ as a
function of $z$. The temperature of the atmosphere is assumed to fall off from the
ambient temperature at the Earth's surface $T_{\rm amb}$, with a lapse rate $l$ considered to
be constant with height. Thus, at height $h$, the temperature is $T_{\rm amb} - lh$. We require
the mean temperature weighted in proportion to the density of water vapor, which
is exponentially distributed with scale height $h_0$:

$$T_{\rm at} = T_{\rm amb} - \frac{l \int_0^\infty h\, e^{-h/h_0}\, dh}{\int_0^\infty e^{-h/h_0}\, dh} = T_{\rm amb} - l h_0 . \tag{13.115}$$

The lapse rate resulting from adiabatic expansion of rising air, 9.8 K km$^{-1}$, can be
used as an approximate value, but as indicated earlier, a typical measured value is
$\sim 6.5$ K km$^{-1}$. The scale height of water vapor is approximately 2 km. Thus, $T_{\rm at}$ is
typically less than $T_{\rm amb}$ by $\sim 13$-20 K.
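
To make the tipping-scan procedure concrete, the following sketch fits $\tau_0$ and the constant term to a set of measurements at several zenith angles, with $T_{\rm at}$ fixed from the lapse-rate estimate. It is a minimal illustration under stated assumptions, not a description of any particular instrument's software: the data are synthetic and noise-free, the function names are ours, and the fit is a simple repeated grid search over $\tau_0$ (for each trial $\tau_0$, the best constant term is the mean residual).

```python
import math

T_CMB = 2.7  # K, cosmic microwave background term


def atmosphere_temperature(t_amb, lapse_rate_k_per_km=6.5, scale_height_km=2.0):
    """Water-vapor-weighted mean temperature, T_amb - l * h0."""
    return t_amb - lapse_rate_k_per_km * scale_height_km


def sky_term(tau0, sec_z, t_at):
    """Elevation-dependent part of the measured temperature."""
    att = math.exp(-tau0 * sec_z)
    return t_at * (1.0 - att) + T_CMB * att


def fit_tipping_scan(sec_z_values, t_meas_values, t_at, tau_max=3.0):
    """Return (tau0, T_const) minimizing the squared residuals."""

    def cost(tau0):
        resid = [t - sky_term(tau0, s, t_at)
                 for s, t in zip(sec_z_values, t_meas_values)]
        t_const = sum(resid) / len(resid)
        return sum((r - t_const) ** 2 for r in resid), t_const

    lo, hi = 0.0, tau_max
    best = lo
    for _ in range(6):  # each pass narrows the bracket by a factor of 50
        step = (hi - lo) / 100.0
        grid = [lo + k * step for k in range(101)]
        best = min(grid, key=lambda t: cost(t)[0])
        lo, hi = max(0.0, best - step), best + step
    return best, cost(best)[1]


# Synthetic scan: tau0 = 0.08, T_const = 150 K, T_amb = 280 K (all hypothetical).
t_at = atmosphere_temperature(280.0)
sec_z = [1.0, 1.5, 2.0, 2.5, 3.0]
t_meas = [150.0 + sky_term(0.08, s, t_at) for s in sec_z]

tau_fit, t_const_fit = fit_tipping_scan(sec_z, t_meas, t_at)
print(f"tau0 = {tau_fit:.4f}, T_const = {t_const_fit:.2f} K")
```

A grid search is used rather than a bracketing minimizer because the residual is not a single-valley function of $\tau_0$ over a wide range: it levels off again at large opacity, where the sky term saturates. With real, noisy data one would also want uncertainty estimates, which this sketch omits.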

Figure 13.15 displays examples of data taken on Mauna Kea, which show the 
diurnal and seasonal effects at this site. The cumulative distributions of zenith opacity
at 225 and 850 GHz as measured at Cerro Chajnantor and Llano de Chajnantor in Chile,
Mauna Kea, and the South Pole are shown in Fig. 13.16. Measurements of mean
opacity provide a basis for calculating the loss in sensitivity due to absorption of 
the signal and the addition of noise from the atmosphere [see Eq. (13.50)]. The 
opacity varies both diurnally and annually, so measurements at hourly intervals 
Fig. 13.15 (a) Diurnal and seasonal zenith opacity at 225 GHz measured at the CSO site on
Mauna Kea (4070-m elevation) for a three-year period (August 1989-July 1992) computed from 
14,900 measurements. The minimum value and the 25th, 50th, and 75th percentiles are shown. The 
increase in opacity during the day is caused by an inversion layer that rises above the mountain 
in the afternoons. (b) Diurnal and seasonal variation of the rms path length on Mauna Kea on a 
100-m baseline, determined from observations of a geostationary satellite at 11 GHz. From Masson 
(1994a), courtesy of and © the Astronomical Society of the Pacific. 


over a year or more are required for reliable comparison of different sites. Long- 
term variability due to climatic effects (e.g., El Niño) can be significant. Table 13.3 
shows the effect of site altitude on opacity. Comparison of the measurements of $A_\nu$
and $B_\nu$ shows that both parameters decrease with altitude because of the effects of
pressure broadening. Comparisons of opacities at various frequencies can be made 
with broadband Fourier transform spectrometers (Hills et al. 1978; Matsushita et al. 
1999; Paine et al. 2000; Pardo et al. 2001b). 
Fig. 13.16 Cumulative distributions of the zenith optical depth at 850 GHz (left panel) and at
225 GHz (right panel) at Cerro Chajnantor, Chile (5612-m elevation); the ALMA site on Llano
de Chajnantor, Chile (5060-m elevation); the CSO on Mauna Kea, Hawaii (4100-m elevation); and
the South Pole (2835-m elevation) for the periods April 1995-April 1999, Jan. 1997-July 1999,
and Jan. 1992-Dec. 1992, respectively. Note that the median opacity at 225 GHz at the VLBA
site on Mauna Kea (3720-m elevation) for the same time interval at the CSO site was 0.13. The 
median opacity for the VLA site (2124-m elevation) for the period 1990-1998 was 0.3 (Butler 
1998). Conditions at lower elevation sites are correspondingly worse. For example, at a sea-level 
site in Cambridge, MA, the 225-GHz opacity, inferred from measurements at 115 GHz, was 0.5 for 
the six-month winter observing seasons spanning 1994-1997. Conditions on Dome C, Antarctica 
(3260-m elevation), are significantly better than at the South Pole (Calisse et al. 2004), and Ridge A 
on the Antarctic high plateau (4050-m elevation) may have the lowest water vapor on the planet 
(Sims et al. 2012). Adapted from Radford and Peterson (2016). 


13.2.2 Site Testing by Direct Measurement of Phase Stability 


Interferometer observations provide a direct method of determining atmospheric 
phase fluctuations. Signals from a geostationary satellite are usually used, since 
strong signals can be obtained using small, nontracking antennas. This technique, 
called satellite-tracking interferometry (STI), was developed by Ishiguro et al.
(1990), Masson (1994a), and Radford et al. (1996). It was used in site testing
for the SAO Submillimeter Array on Mauna Kea, Hawaii; the Atacama Large
Millimeter/submillimeter Array at Llano de Chajnantor; and potential SKA sites.
Several suitable geostationary-orbit satellites operate in bands allocated to the fixed 
and broadcasting services near 11 GHz. Two commercial satellite TV antennas of 
diameter 1.8 m provide signal-to-noise ratios close to 60 dB. For measurements of 
atmospheric phase, baselines of 100-300 m have been used. The residual motion
of the satellite, as well as any temperature variations, can cause unwanted phase 
drifts. These are generally slow compared with the atmospheric effects and can be 
removed by subtracting a mean and slope from the output data. The variance of the 
fluctuations resulting from the system noise can also be determined and subtracted 
from the variance of the measured phases. The test interferometer provides a
measure of the structure function of phase $D_\phi(d)$ for one value of projected baseline
$d$ (see Fig. 13.15b).

It is sometimes useful to compare the quality of sites based on STI measurements 
made with different baselines and zenith angles. For baselines in the vicinity of 
100 m (see also Fig. A13.4), a reasonable scaling is 


$$\sigma_\phi \simeq \sigma_0\, d^{5/6} \sqrt{\sec z} . \tag{13.116}$$


For longer baselines, other power laws will be more appropriate. 

With the frozen-screen approximation, the power-law exponent can be deter- 
mined from the power spectrum of the fluctuations. An example is shown in 
Fig. 13.17 (see also Bolton et al. 2011). Thus, in extrapolating $D_\phi(d)$ from a
single-spacing measurement, one does not have to depend on the theoretical values of the
exponent of $d$ but can use the measurements of $D_\tau(\tau)$ to determine the range and
variation [see Eq. (13.108) and Table 13.2]. For the example shown in Fig. 13.17, the
power-law slope for frequencies above 0.01 Hz is 2.5, slightly below the value of 8/3
or 2.67 predicted for Kolmogorov turbulence. The spectrum flattens at frequencies
below 0.01 Hz because of the filtering effect of the interferometer. Fluctuations
larger than the baseline, 100 m in this case, cause little phase effect. For the corner
frequency $f_c = v_s/d$, the wind speed along the baseline direction can be inferred to
be about 1 m s$^{-1}$.
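
Two of the steps described above are simple enough to sketch in code: inferring the wind speed from the corner frequency, and referring an STI measurement to zenith and to a 100-m baseline. The normalization assumes the scaling $\sigma \propto d^{5/6}\sqrt{\sec z}$ with $d$ measured in units of the reference baseline, which is appropriate only near 100 m. The measurement values in the second call are hypothetical.

```python
import math


def wind_speed_from_corner(f_corner_hz, baseline_m):
    """Invert f_c = v_s / d for the wind component along the baseline (m/s)."""
    return f_corner_hz * baseline_m


def normalize_rms_path(sigma, baseline_m, zenith_angle_deg, ref_baseline_m=100.0):
    """Refer a measured rms path to zenith and to the reference baseline,
    assuming sigma = sigma0 * (d / d_ref)^(5/6) * sqrt(sec z)."""
    sec_z = 1.0 / math.cos(math.radians(zenith_angle_deg))
    scale = (baseline_m / ref_baseline_m) ** (5.0 / 6.0) * math.sqrt(sec_z)
    return sigma / scale


# Corner frequency of 0.01 Hz on a 100-m baseline, as in the example above.
v_s = wind_speed_from_corner(0.01, 100.0)

# Hypothetical measurement: 0.30 mm rms on a 300-m baseline at 60 deg zenith angle.
sigma0 = normalize_rms_path(0.30, 300.0, 60.0)

print(f"v_s = {v_s:.1f} m/s, sigma0 = {sigma0:.3f} mm")
```

For baselines much longer than 100 m a different exponent would be needed, so the `5/6` in `normalize_rms_path` should be treated as a parameter of the model rather than a constant of nature.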

Table 13.4 shows a compilation of the measurements of the structure function 
referred to a baseline of 100 m. The range of values reported for a fiducial baseline 
Fig. 13.17 The square root of the temporal power spectrum [i.e., Eq. (13.107)] measured on a
100-m baseline on Mauna Kea (CSO site). The tropospheric wind speed along the baseline can 
be computed from the break in the spectrum. From Masson (1994a), courtesy of and © the 
Astronomical Society of the Pacific. 
Fig. 13.18 Rms path length vs. site elevation referred to zenith and baseline of 100 m. Data taken
in good weather conditions, i.e., winter nighttime (see Table 13.4 for station identifiers). Straight 
line is best-fit exponential with scale height of 2200 m and sea level intercept of 0.15. 


of 100m are meant to reflect the median conditions at night during the winter and 
daytime during the summer. The measurements were obtained either by satellite 
interferometry or astronomical measurements. A plot of the rms phase noise vs. 
site altitude for “best conditions” is presented in Fig. 13.18. The decrease of rms 
phase noise vs. altitude is evident. With the assumption that the turbulence, i.e.,
$C_n^2$, is proportional to water vapor density and that the water vapor is distributed
exponentially with a scale height of $h_0$, we obtain from Eq. (13.100) the result

$$\sigma = \sigma_0\, e^{-h/2h_0} . \tag{13.117}$$

(The factor of 2 arises from the fact that $\sigma_\phi = \sqrt{D_\phi}$. The line shown is a fit to this
equation.) The value of $h_0$ is 2.2 km, close to the nominal scale height of 2 km, and
$\sigma_0 = 0.05$ mm. Because the distillation of this information from disparate sources
is difficult, the results are meant to show the importance of altitude rather than make 
small distinctions among observatories. Local conditions can also be important. See 
Masson (1994b) for further discussion. 

A wide range of power-law indices has been observed (see Table 13.4). Much of
the variation between 0.33 and 0.833 is due to the effects of thin scattering layers
in the troposphere, which effectively move or blur the crossover from 2-D to 3-D
turbulence (see Bolton et al. 2011). Beaupuits et al. (2005) explored this problem
by pointing two 183-GHz water-vapor radiometers so that their beams intersected at
an altitude of about 1500 m. By analyzing the delay between the radiometer signals,
they identified a significant turbulent layer near 600 m.

The atmospheric phase noise, if left uncorrected, causes a coherence loss in an 
interferometer. For the model in Fig. 13.18, the baseline $d_c$ at which the coherence
factor, equal to measured/true visibility defined in Eq. (13.80), falls to a given value
$C$ can be derived from Eq. (13.117), giving

$$d_c = 100 \left[ -\frac{\ln C}{2\pi^2} \left(\frac{\lambda}{\sigma_0}\right)^2 \right]^{3/5} e^{3h/5h_0} . \tag{13.118}$$

For example, with $\sigma_0 = 0.10$ mm, $h_0 = 2200$ m, $h = 5000$ m, $\lambda = 1.3$ mm, and
$C = 0.9$, $d_c = 80$ m.


13.3 Calibration via Atmospheric Emission 


A practical method of estimating phase fluctuations is to measure the integrated 
water vapor in the direction of each antenna beam. This usually requires an auxiliary 
radiometer at each antenna to measure the sky brightness temperature. Water vapor 
is the main cause of opacity at radio frequencies (except for the oxygen bands 
at 50-70 and 118 GHz), even at frequencies well away from the centers of water 
vapor lines, as can be seen in Fig. 13.7. Away from the centers of spectral lines, the 
opacity is due to the far line wings of infrared transitions. There is also an important
continuum component of the absorption caused by water vapor, which varies as $\nu^2$
(Rosenkranz 1998). This component includes various quantum mechanical effects
involving water molecules such as dimers (Chylek and Geldart 1997). It is usually
necessary to model this component with an empirical coefficient. In addition, as
described in Sect. 13.3.2, the water droplets in the form of clouds and fog, as well
as ice crystals, contribute absorption that varies as $\nu^2$. Hence, there are two distinct
methods of calibration: those based on measurement of sky brightness in the bands 
between the lines (continuum) and those based on measurements near a spectral line 
(see Welch 1999). The sensitivities of the brightness temperature to the propagation 
delay are listed in Table 13.5 for selected frequencies. 


13.3.1 Continuum Calibration 


The method of measuring the continuum sky brightness at, say, 90 or 225 GHz 
has several advantages, as first described by Zivanovic et al. (1995). The same 
radiometers used for the astronomical measurements can be used for the sky 
brightness measurements. At 225 GHz, if phase calibration to an accuracy of a 
twentieth of a wavelength is required, then, from the sensitivity listed in Table 13.5, 
the brightness temperature accuracy required is 0.1 K. For a system temperature of
200 K, this accuracy requires a gain stability of $5 \times 10^{-4}$. Such stability usually
requires special attention to the temperature stabilization of the receiver cryogenics. 
In addition, the gain scales must be accurately calibrated. Changes in ground pickup 
Table 13.5 Brightness temperature sensitivity $dT_B/dw$ (K/mm) for various frequencies at site
elevations of 0 and 5 km for various values of precipitable water vapor^a

                                                  0-km elevation         5-km elevation
$\nu$ (GHz)   Origin of opacity                   $w = 0$ mm   15 mm     $w = 0$ mm   15 mm
22.2          Line center ($6_{16} - 5_{23}$)     1.9          1.7       2.8          2.8
90.0          Continuum                           1.8          2.1^b     1.2          1.2
183.3         Line center ($3_{13} - 2_{20}$)     294          0.0       527          51.4
185.0         Line wing ($3_{13} - 2_{20}$)       222          0.1       280          91.2
230.0         Continuum                           15.9         L3        11.4         9.8
690.0         Continuum                           380          0.0       297          82.5

^a Entries in this table were calculated with the am model (Paine 2016) for median midlatitude
atmospheric profiles. Note that $w = 15$ and 0 mm are close to the measured values for midlatitudes
for altitudes of 0 and 5 km, respectively. The effects of pressure broadening are clearly evident. For
example, at $w = 0$, $dT_B/dw$ is less at sea level than at 5 km for the 22.2- and 183.3-GHz spectral
lines, while the opposite is true for the continuum bands. More detailed information about the
brightness temperature sensitivity near the 183-GHz transition can be found in Fig. 13.21.

^b More sensitive than for $w = 0$ because of the effect of H$_2$O line self-broadening.
Fig. 13.19 Correlation between interferometric phase predicted by total power measurement at
230 GHz vs. interferometric phase. The data were taken over a period of 20 min on a 140- 
m baseline of the Submillimeter Array (SMA) on Mauna Kea. The total powers (i.e., antenna 
temperatures) at each antenna were used to estimate phase with a linear model having free 
parameters. The straight line shown has unity slope and zero intercept. The rms phase error is 
improved from 72 to 27°, corresponding to path length residuals of 260 to 98 um, respectively. 
From Battat et al. (2004). © AAS. Reproduced with permission. 


can be misinterpreted as changes in sky brightness temperature. The presence of clouds
defeats this method, because of the contribution of liquid water to the opacity.
An example of the viability of this type of calibration is shown in Fig. 13.19. The
application of this method for the Plateau de Bure interferometer is described by 
Bremer (2002). For further discussion, see Matsushita et al. (2002). 
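
The stability requirement quoted in this section can be estimated as follows. The sketch assumes the approximate conversion of 6.3 mm of excess path per millimeter of precipitable water and a sensitivity of 11.4 K/mm, the tabulated 230-GHz value for a 5-km site with $w = 0$. Both choices are ours for illustration, and the function name is hypothetical.

```python
C_LIGHT = 2.998e8  # m/s


def required_stability(freq_hz, fraction_of_wavelength, dtb_dw_k_per_mm,
                       t_sys_k, path_per_water=6.3):
    """Return (brightness accuracy in K, fractional gain stability)."""
    wavelength_mm = C_LIGHT / freq_hz * 1e3
    path_mm = wavelength_mm * fraction_of_wavelength  # allowed path error
    water_mm = path_mm / path_per_water               # equivalent change in w
    delta_tb = water_mm * dtb_dw_k_per_mm             # required accuracy, K
    return delta_tb, delta_tb / t_sys_k


delta_tb, gain_stability = required_stability(225e9, 1 / 20, 11.4, 200.0)
print(f"delta T_B = {delta_tb:.2f} K, gain stability = {gain_stability:.1e}")
```

The result is of the same order as the 0.1 K and $5 \times 10^{-4}$ quoted in the text. The exact figure depends on which sensitivity value is adopted.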
13.3.2 22-GHz Water-Vapor Radiometry


The idea of determining the vertical distribution of water vapor in the atmosphere 
from brightness temperature measurement at frequencies near the 22-GHz line was 
first investigated theoretically by Barrett and Chung (1962). It was further developed 
into a technique for determining the excess propagation path by Westwater (1967) 
and Schaper et al. (1970). To appreciate the degree of correlation between wet path 
length and brightness temperature, we need to examine the dependence of these 
quantities on pressure, water vapor density, and temperature. We consider here the 
interpretation of measurements near the 22.2-GHz resonance. The absorption coef- 
ficient given by Eq. (13.42) is complicated, but at line center it can be approximated 
by 


$$\alpha_m = 0.36\, \frac{\rho_V}{P\,T^{1.875}}\, e^{-644/T} , \tag{13.119}$$


where $T$ is in kelvins, and we have neglected all except the leading terms in
Eq. (13.42). We assume that the opacity given by Eq. (13.47) is small, so that the
brightness temperature defined by Eq. (13.45) can be written

$$T_B \simeq 17.8 \int_0^\infty \frac{\rho_V(h)}{P(h)\,T(h)^{0.875}}\, e^{-644/T(h)}\, dh , \tag{13.120}$$

when we neglect the background temperature $T_{B0}$ and any contributions from
clouds. Recall that Eq. (13.16) shows that

$$\mathcal{L}_V = 1763 \times 10^{-6} \int_0^\infty \frac{\rho_V(h)}{T(h)}\, dh . \tag{13.121}$$

Thus, if $P$ and $T$ were constant with height and equal to 1013 mb and 280 K,
respectively, we could use Eq. (13.19), $\mathcal{L}_V \simeq 6.3w$, to obtain from Eq. (13.120) the
relation $T_B \simeq 12.7w$, where $w$ is the column height of water vapor [see Eq. (13.18)].
Hence, to the degree of approximation used above, we obtain

$$T_B\,(22.2\ {\rm GHz})\ ({\rm K}) \simeq 2.1\,\mathcal{L}_V\ ({\rm cm}) . \tag{13.122}$$


Note that this approximation is valid at sea level. Since, because of pressure
broadening, the brightness temperature scales inversely with total pressure [see
Eq. (13.120)], the coefficient in Eq. (13.122) is increased to 3.9 for a site at
5000-m elevation, where the pressure is approximately 540 mb. Measurements of
brightness temperature and path length estimated from radiosonde profiles show
that Eq. (13.122) is a good approximation (see, e.g., Moran and Rosen 1981). Recall
that $\rho_V$ is approximately exponentially distributed with a scale height of 2 km.
The temperature, on average, decreases by about 2% per kilometer. This change
affects the proportionality between $T_B$ and $\mathcal{L}_V$ only through the exponential factor
in Eq. (13.120) and the slight difference in the power law for temperature. Thus,
temperature has a small effect. The pressure decreases by 10% per kilometer, so
water vapor at higher altitudes contributes more heavily to $T_B$ than is desirable
for estimation by radiometry. The sensitivity of $T_B$ to pressure is decreased by
moving off the resonance frequency to a frequency near the half-power point of
the transition. The reason for this is that as pressure increases, the line profile
broadens while the integrated line profile is constant. Therefore, the absorption
at line center decreases and the absorption in the line wings increases. Westwater
(1967) showed that at 20.6 GHz, the absorption is nearly invariant with pressure.
This particular frequency is called the hinge point. The opacity at this frequency is
less than at the line center, so the nonlinear relationship between $T_B$ and opacity is
less important.
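
The proportionality between brightness temperature and wet path length, and its dependence on pressure, can be checked numerically for an isothermal, isobaric layer. The sketch below assumes that the coefficient of the brightness temperature integral is 17.8 with heights in centimeters, so that the water-vapor column cancels against the path-length integral expressed in the same unit. That unit convention is our inference from the numerical values quoted above, not something stated in the text.

```python
import math


def tb_per_path_k_per_cm(pressure_mb, temperature_k):
    """Ratio T_B / L_V (K per cm) for constant pressure and temperature.

    Numerator: integrand factor of the line-center brightness temperature,
    17.8 * exp(-644/T) / (P * T^0.875), heights assumed to be in cm.
    Denominator: integrand factor of the wet path length, 1763e-6 / T.
    """
    tb_factor = 17.8 * math.exp(-644.0 / temperature_k) / (
        pressure_mb * temperature_k ** 0.875)
    path_factor = 1763e-6 / temperature_k
    return tb_factor / path_factor


sea_level = tb_per_path_k_per_cm(1013.0, 280.0)
high_site = tb_per_path_k_per_cm(540.0, 280.0)  # about 5000-m elevation

print(f"sea level: {sea_level:.2f} K/cm, 540 mb: {high_site:.2f} K/cm")
```

Both values come out within a few percent of the coefficients 2.1 and 3.9 given in the text, and their ratio is exactly the pressure ratio, as expected for a line-center absorption that scales inversely with total pressure.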

The foregoing discussion assumes that measurements of $T_B$ are made in clear-sky
conditions. The water droplet content in clouds or fog causes substantial
absorption but small change in the index of refraction compared with that of
water vapor. Fortunately, the effect of clouds can be eliminated by combining
measurements at two frequencies. In nonprecipitating clouds, the sizes of the
water droplets are generally less than 100 µm, and at wavelengths greater than
a few millimeters, the scattering is small and the attenuation is due primarily to
absorption. The absorption coefficient is given by the empirical formula (Staelin
1966)


$$\alpha_{\rm clouds} \simeq \frac{\rho_L\, 10^{0.0122(291 - T)}}{\lambda^2} \quad ({\rm m}^{-1}) , \tag{13.123}$$


where pz is the density of liquid water droplets in grams per cubic meter, À is 
the wavelength in meters, and T is in kelvins. This formula is valid for A greater 
than ~ 3 mm where the droplet sizes are small compared with A /(27). For shorter 
wavelengths, the absorption is less than predicted by Eq. (13.123) (Freeman 1987; 
Ray 1972). A very wet cumulus cloud with a water density of 1 g m~? and a size 
of 1km will have an absorption coefficient of 7 x 107° m7! and will therefore 
have a brightness temperature of about 20 K at 22 GHz. The index of refraction of 
liquid water is about 5 at 22 GHz for T = 280K (Goldstein 1951). The actual 
excess propagation path through the cloud due to liquid water would be about 
4mm, but the predicted excess path from Eq. (13.122) is 10cm. Thus, the brightness 
temperature at a single frequency cannot be used reliably to estimate the excess path 
length when clouds are present. In order to eliminate the brightness temperature 
contribution of clouds, measurements must be made at two frequencies, vı and v2, 
one near the water line and one well off the water line, respectively. The brightness 
temperature is 


$$ T_{Bi} = T_{BVi} + T_{BCi} , \tag{13.124} $$


where T_BVi and T_BCi are the brightness temperatures due to water vapor and
clouds at frequency i. Here we neglect the effects of atmospheric O₂. Since, from
Eq. (13.123), T_BC ∝ ν², we can form the observable


$$ T_{B1} - T_{B2}\,\frac{\nu_1^2}{\nu_2^2} = T_{BV1} - T_{BV2}\,\frac{\nu_1^2}{\nu_2^2} , \tag{13.125} $$


which eliminates the effect of clouds. The correlation between T_BV1 − T_BV2 ν₁²/ν₂²
and L_V can be estimated from model calculations based on Eqs. (13.45)
and (13.16). The off-resonance frequency ν₂ is generally chosen to be about
31 GHz. The problem of finding the two best frequencies and the appropriate
correlation coefficients to use in predicting L_V has been widely discussed
(Westwater 1978; Wu 1979; Westwater and Guiraud 1980). The liquid content of
clouds can also be measured by dual-frequency techniques [see, e.g., Snider et al.
(1980)].
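The cloud numbers quoted above, and the cancellation expressed by Eq. (13.125), can be checked with a short numerical sketch. The absorption coefficient, cloud size, temperature, and index of refraction are the values given in the text; the two water-vapor brightness temperatures are made-up illustrative numbers, and the cloud term is assumed to scale exactly as ν².

```python
import math

# Worked numbers for the "very wet cumulus cloud" quoted in the text.
alpha = 7e-5        # absorption coefficient at 22 GHz (m^-1), from the text
size = 1000.0       # cloud size (m)
T_cloud = 280.0     # physical temperature of the cloud (K)

tau = alpha * size                              # opacity of the cloud
T_BC_22 = T_cloud * (1.0 - math.exp(-tau))      # cloud brightness temperature (K)

# Liquid column: 1 g m^-3 over 1 km is 1 kg m^-2, i.e., a 1-mm layer of liquid.
rho_L = 1.0                                     # liquid density in the cloud (g m^-3)
liquid_depth = rho_L * size / 1.0e6             # m (liquid water is 1e6 g m^-3)
n_liquid = 5.0                                  # index of refraction at 22 GHz
excess_path_liquid = (n_liquid - 1.0) * liquid_depth   # m


def cloud_free_observable(T_B1, T_B2, nu1, nu2):
    """Left-hand side of Eq. (13.125): T_B1 - T_B2 * nu1^2 / nu2^2."""
    return T_B1 - T_B2 * (nu1 / nu2) ** 2


# Hypothetical vapor brightness temperatures at the two frequencies.
nu1, nu2 = 22.2, 31.0                  # GHz
T_BV1, T_BV2 = 30.0, 12.0              # K (illustrative values only)
T_BC1 = T_BC_22                        # cloud term near the water line
T_BC2 = T_BC1 * (nu2 / nu1) ** 2       # cloud term scaled as nu^2

with_cloud = cloud_free_observable(T_BV1 + T_BC1, T_BV2 + T_BC2, nu1, nu2)
without_cloud = cloud_free_observable(T_BV1, T_BV2, nu1, nu2)

print(f"tau = {tau:.3f}, cloud T_B = {T_BC_22:.1f} K")
print(f"excess path from liquid water = {excess_path_liquid * 1e3:.1f} mm")
print(f"observable with cloud = {with_cloud:.3f} K, without = {without_cloud:.3f} K")
```

The two values of the observable agree to rounding error, which is the sense in which the combination is insensitive to clouds; any departure of the real cloud spectrum from ν² would leave a residual.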

The application of multifrequency microwave radiometry to the calibration of
wet path length has been described by Guiraud et al. (1979), Elgered et al. (1982),
Resch (1984), Elgered et al. (1991), and Tahmoush and Rogers (2000). A
high-performance receiver design is discussed by Tanner and Riley (2003). The results
show that L_V can be estimated to an accuracy better than a few millimeters.
This is useful for calibrating VLBI delay measurements and extending coherence
times. Measurements of T_B at the antennas of short-baseline interferometers can be
useful in correcting the interferometer phase. More accurate predictions of L_V, or
interferometer phase, can be obtained by including measurements in other bands.
For example, measurements of the wings of the terrestrial oxygen line near 50 GHz
can be used to probe the vertical temperature structure of the troposphere [see,
e.g., Miner et al. (1972), Snider (1972)]. The accuracy of these schemes has been
analyzed by Solheim et al. (1998).

The observation of the 22-GHz line provides a calibration technique that is 
not sensitive to gain variations and ground pickup. Multiple frequencies can be 
monitored to correct for clouds and the variable distribution of water vapor with 
height [see Eq. (13.125)]. For millimeter observations at moderately dry sites, the 
22-GHz line may be the best choice [see Bremer (2002) for a description of the 
system operating at Plateau de Bure]. An example of phase correction based on this 
line is shown in Fig. 13.20. 


13.3.3 183-GHz Water-Vapor Radiometry 


For observations at very dry sites, the 183-GHz line may give better results (Lay 
1998; Wiedner and Hills 2000). The 183-GHz line is intrinsically about 30 times 
more sensitive than the 22-GHz line. However, the 183-GHz line is much more 
easily saturated (i.e., its opacity exceeds unity) than is the 22-GHz line. 

Fig. 13.20 The interferometric phase (in units of delay) measured at 3-mm wavelength on one
baseline of the interferometer at Owens Valley Radio Observatory (solid line), and the delay
predicted by 22-GHz water-vapor radiometer measurements (dotted line), vs. time. The rms
deviation of the difference is 160 μm. The source is 3C273. From Welch (1999), with kind
permission from and © URSI; see also Woody et al. (2000).

A phase-correcting system utilizing the 183-GHz line was developed for
ALMA, and its application is described by Nikolic et al. (2013). Each telescope of
the array is equipped with a boresighted radiometer having four channels sampling
parts of the 183-GHz line profile. The radiometers are double-sideband, and the 
four channels are symmetrically placed around line center at offsets of 0.5, 3.1, 5.2, 
and 8.3 GHz. Theoretical line profiles are shown in Fig. 13.21 for various values of 
precipitable water vapor, w [see Eq. (13.18)]. For low values of w, e.g., 0.3 mm, the 
maximum sensitivity, dT_B/dw, is obtained at line center. This sensitivity decreases
to zero as the line saturates. The channels away from line center become more 
important as w increases. Combining the measurements with appropriate statistical 
weights allows accurate estimates of the propagation path over a wide range of 
conditions. The actual sensitivity coefficients are derived empirically. An example 
of the reduction in phase noise is shown in Fig. 13.22. 

The system does not work well in the presence of clouds, which add a brightness
temperature contribution ~ ν² [see Eq. (13.123)]. Separate measurements on either
side of the line could allow the estimation of the cloud contribution. At low levels 
of w, an unmodeled contribution to the fluctuations is detected that is attributed 
to fluctuation in the dry-air component of refractivity. Ancillary measurements of 
total pressure at each antenna may allow the effects of these fluctuations to be 
corrected. 

The 183-GHz line has been used to estimate w by the atmospheric remote sensing
community [e.g., Racette et al. (2005)].

Fig. 13.21 (top) Theoretical brightness temperature profiles for the water vapor transition
centered at 183.3 GHz appropriate for a site at 5000-m altitude for six values of w, the water vapor
column density: (from bottom to top) 0.3, 0.6, 1, 2, 3, and 5 mm. The small blip noticeable at
184.4 GHz is the 10_{0,10}–9_{1,9} ozone transition originating in the upper atmosphere, where pressure
broadening is small. The brightness temperature profiles become increasingly saturated at the
atmospheric temperature as w increases [see Eq. (13.48)]. (bottom) The change of brightness
temperature with water vapor column density, dT_B/dw, for the same values of w (from top to
bottom). At w = 5 mm, the brightness temperature sensitivity to a change in water vapor column
density is essentially zero at line center and reaches a broad maximum about 5 GHz from line
center. From B. Nikolic et al. (2013), reproduced with permission. © ESO.

Fig. 13.22 The rms phase (in microns) deviation (square root of the phase structure function, D_φ)
vs. projected baseline length. Measurements were made on the source 3C138 at 230 GHz in a
period of 15 min. The water vapor column density was 1.4 mm, and the surface wind speed was
7 m s⁻¹. The circles show the uncalibrated results. The three-part power law is a suggestive fit to the
data. The break at 670 m marks the transition from 3-D to 2-D turbulence and indicates a thickness
to the turbulent layer of about 2 km [see Eq. (A13.17)]. The break at 3 km indicates the outer
scale of the turbulence. The triangles show the rms phase deviation after water-vapor radiometer
corrections. The squares show the rms phase deviations after phase referencing to a calibrator
source offset by 1.3° (target/calibrator cycle time was 20 s). Adapted from ALMA Partnership
et al. (2015).


13.4 Reduction of Atmospheric Phase Errors by Calibration 


Atmospheric phase errors can be treated like antenna-based phase errors in considering
their effect on an image. In Sect. 11.4, it is shown that the dynamic range of a
snapshot image is approximately


$$ \frac{\sqrt{n_a (n_a - 1)}}{\phi_{\rm rms}} , \tag{13.126} $$


where φ_rms is the rms of the phase error in radians measured with pairs of antennas,
and n_a is the number of antennas. For example, if φ_rms is 1 rad and n_a = 30, the
dynamic range is ~ 30. As a rough guide, the range of φ_rms from 0.5 to 1 rad
represents array performance from fair to marginal. The improvement in the image
with longer integration depends on the spectrum of the phase fluctuations.

For phase calibration at centimeter wavelengths, it is common to observe a phase
calibrator at intervals of ~ 20–30 min. At millimeter wavelengths, this is generally
not satisfactory, because of the much greater phase fluctuations resulting from the
atmosphere. Procedures that can be used at millimeter and submillimeter wavelengths
to reduce the effect of atmospheric phase fluctuations are described below.
These methods are analogous to those of adaptive optics at optical wavelengths.
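As a quick check of the snapshot dynamic range in Eq. (13.126), the sketch below assumes the form √[n_a(n_a − 1)]/φ_rms, which reproduces the value of about 30 quoted for n_a = 30 and φ_rms = 1 rad. The function name is illustrative.

```python
import math


def snapshot_dynamic_range(n_a, phi_rms):
    """Approximate snapshot dynamic range for n_a antennas and
    baseline-based rms phase error phi_rms (radians), assuming the
    form sqrt(n_a * (n_a - 1)) / phi_rms."""
    return math.sqrt(n_a * (n_a - 1)) / phi_rms


# The "fair to marginal" range of phase error for a 30-antenna array.
for phi in (0.5, 1.0):
    dr = snapshot_dynamic_range(30, phi)
    print(f"phi_rms = {phi:.1f} rad -> dynamic range ~ {dr:.0f}")
```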


Self-Calibration. The simplest way to remove the effects of atmospheric phase 
fluctuations is to use self-calibration, as described in Sects. 10.3 and 11.3. This 
method depends on phase closure relationships in groups of three or more antennas. 
In applying this method, it is necessary to integrate the correlator output data for 
a long enough time that the source can be detected; that is, the measured visibility 
phase must result mainly from the source, not the instrumental noise. However, the 
integration time is limited by the fluctuation rate, so self-calibration is not useful for 
sources that require long integration times to detect. 


Frequent Calibration (Fast Switching). Frequent phase calibration using an
unresolved source close to the target source (the source under study) can greatly
reduce atmospheric phase errors (Holdaway et al. 1995b; Lay 1997b). To ensure
that the atmospheric phase measured on the calibrator is close to that for the target
source, the angular distance between the two sources must be no more than a few
degrees. The time difference must be less than ~ 1 min, so fast position switching
between the target source and the calibrator is required. In the layer in which most of
the water vapor occurs, the lines of sight from the antennas to the target source and
the calibrator pass within a distance d_tc of one another. For a nominal screen height
of 1 km, d_tc ≈ 17θ, where θ is the angular separation in degrees and d_tc is in meters.
For one antenna, the rms phase difference between the two paths is √[D_φ(d_tc)] at
any instant. If t_cyc is the time to complete one observing cycle of the target source
and the calibrator, then the mean time difference between the measurements on
these two sources is t_cyc/2. In time t_cyc/2, the atmosphere will have moved v_s t_cyc/2.
Thus, the phase difference between the measurements on the two paths is effectively
√[D_φ(d_tc + v_s t_cyc/2)]. This is a worst-case estimate, since we have taken the scalar sum
of vector quantities corresponding to d_tc and v_s. For the difference in the paths to the
two antennas as measured by the interferometer, the rms value will be √2 times that
for one antenna, so the residual atmospheric phase error in the measured visibility is


$$ \phi_{\rm rms} = \sqrt{2 D_\phi(d_{tc} + v_s t_{\rm cyc}/2)} . \tag{13.127} $$


Note that φ_rms is independent of the baseline, so the phase errors should not increase
with baseline length. The total time for one cycle of observation of the two sources
is the sum of the observing times on the target source and the calibrator, plus twice
the antenna slew time between the sources and twice the setup time between ending
the slew motion and starting to record data. The observing times required on each
of the sources depend on the flux densities and the sensitivity of the instrument.
For the calibrator, there may be a choice between a weak source nearby and a
stronger one that requires less observing time but more antenna slew time. In order
to use calibration sources as a general solution to the atmospheric phase problem,
suitable calibrators must be available within a few degrees of any point on the sky.
Since calibrator flux densities generally decrease with increasing frequency, it may
be necessary to observe the calibrator at a lower frequency than is used for the
target source. The measured phase for the calibrator must then be multiplied by
ν_source/ν_cal (since the troposphere is essentially nondispersive) before subtraction
from the target source phases, so the accuracy required for the calibrator phase
is increased. Thus, the observing frequency for the calibrator should not be too
low; a frequency near 90 GHz may be a practical choice with observations of the
target source up to a few hundred gigahertz. The effectiveness of the fast-switching
technique is demonstrated by the data in Fig. 13.23. Note that the break in the curve
for the 300-s averaging time at antenna spacing 1500 m indicates that the wind speed
was about 2 × 1500/300 = 10 m s⁻¹ (Carilli and Holdaway 1999). The effectiveness
of fast switching for ALMA is described by Asaki et al. (2014).

Fig. 13.23 The square root of the phase structure function, that is, the rms phase deviation vs.
baseline length, for data taken at the VLA at 22 GHz for various averaging times. These data show
the effectiveness of fast switching. In these measurements, the target source and calibrator source
were the same, 0748+240. The solid squares (labeled 5400 sec) show the rms phase fluctuations with
no switching (same data as in Fig. 13.13). The circles and the stars show the rms phase deviation
for cycle times 300 s and 20 s, respectively. From Carilli and Holdaway (1999). © 1999 by the
American Geophys. Union.


Paired or Clustered Antennas. Location of antennas in closely spaced pairs is
an alternative to fast movement between the target source and the calibrator. One
antenna of each pair continuously observes the target source and the other observes
the calibrator. With this scheme, t_cyc is zero in Eq. (13.127), but the spacing of the
paired antennas, d_p, should be included. The rms residual atmospheric error in the
visibility phase becomes


$$ \phi_{\rm rms} = \sqrt{2 D_\phi(d_{tc} + d_p)} . \tag{13.128} $$


As in Eq. (13.127), φ_rms is a worst-case estimate, since we have taken a scalar sum of
vector quantities corresponding to d_tc and d_p. For a 2° position difference between
the target source and the calibrator, and an effective height of 1 km for the water
vapor, d_tc = 35 m. For antennas of diameter ~ 10 m, which is typical for antennas
operating up to 300 GHz, d_p should be about 15 m to avoid serious shadowing, and
this is smaller than v_s t_cyc/2 for the fast-switching scheme, since v_s is typically
6–12 m s⁻¹ and t_cyc is 10 s or more. Thus, with paired antennas, the residual phase
errors are somewhat less than with fast switching. Also, observing time is not
wasted during antenna slewing and setup. However, with fast switching, about half
of the time is devoted to the target source, whereas with paired antennas, half of the
antennas are devoted to the target source, so in the latter case, the sensitivity is less
by a factor ~ √2. In some cases, the paired antennas are available for use in an
array. If the “science array” and the “reference array” are separate, there is no loss
of capability in the “science array.” Demonstration of the technique for the VLA is
described by Carilli and Holdaway (1999) and for the Nobeyama Radio Observatory
by Asaki et al. (1996). Another example is the CARMA array of 6-m- and
10-m-diameter antennas. The reference array consists of 3.5-m-diameter antennas.
This system is described by Pérez et al. (2010) and Zauderer et al. (2016).
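The two residual-phase estimates, Eqs. (13.127) and (13.128), differ only in the effective separation at which the phase structure function is evaluated. The sketch below compares them for the numbers used in the text (2° separation, 1-km screen height, 15-m pair spacing). The power-law structure function and its normalization (0.2 rad rms at 100 m) are purely hypothetical stand-ins for a measured D_φ.

```python
import math


def d_tc(theta_deg, height_m=1000.0):
    """Separation (m) of the target and calibrator lines of sight at the
    height of the water-vapor layer; about 17 m per degree for 1 km."""
    return height_m * math.radians(theta_deg)


def d_phi_power_law(d_m, rms_at_100m_rad=0.2, exponent=5.0 / 3.0):
    """Hypothetical structure function (rad^2): D_phi = phi_100^2 (d/100 m)^exponent."""
    return rms_at_100m_rad ** 2 * (d_m / 100.0) ** exponent


def residual_phase_rms(effective_sep_m, structure_fn=d_phi_power_law):
    """Residual rms visibility phase (rad): sqrt(2 * D_phi(effective separation))."""
    return math.sqrt(2.0 * structure_fn(effective_sep_m))


theta = 2.0      # target-calibrator separation (deg)
v_s = 10.0       # wind speed (m/s), within the 6-12 m/s range quoted
t_cyc = 10.0     # switching cycle time (s)
d_p = 15.0       # spacing of the paired antennas (m)

sep_fast = d_tc(theta) + v_s * t_cyc / 2.0    # argument of D_phi in Eq. (13.127)
sep_pair = d_tc(theta) + d_p                  # argument of D_phi in Eq. (13.128)

print(f"d_tc = {d_tc(theta):.1f} m")
print(f"fast switching: {sep_fast:.1f} m -> {residual_phase_rms(sep_fast):.3f} rad")
print(f"paired antennas: {sep_pair:.1f} m -> {residual_phase_rms(sep_pair):.3f} rad")
```

Both estimates are worst-case values because the separations are added as scalars; the comparison says nothing about the √2 sensitivity penalty of dedicating half the antennas to the calibrator.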


Appendix 13.1 Importance of the 22-GHz Line in WWII 
Radar Development 


The history of the 22-GHz transition of water vapor is quite interesting. The water
vapor molecule has different moments of inertia about its three axes of rotation,
and hence its rotational spectrum is complex, as shown in Fig. A13.1. The rotational
energy levels were first determined through measurements of the infrared spectra
by Randall et al. (1937). Van Vleck noted in an MIT Radiation Laboratory report
(Van Vleck 1942) that these energy levels indicated the existence of an allowable
microwave line in the range of 1.2–1.5 cm (20–25 GHz), due to a chance
near-coincidence of two energy levels in adjacent rotational ladders. Lying at an energy
level of 447 cm⁻¹ above the ground state, corresponding to a temperature of 640 K,
the line has a Boltzmann population factor at atmospheric temperature that is about
0.1. Van Vleck calculated that atmospheric opacity along horizontal path lengths
due to the absorption of H₂O and absorption in the wing of the O₂ lines near
60 GHz would cause problems for radars operating at short centimeter wavelengths.
However, there was little empirical data about the pressure-broadening constants
for the line widths [see Eq. (13.43) and Fig. 13.7] of these lines, and the estimates
Van Vleck used were almost three times larger than their actual values.
Therefore, he substantially overestimated the absorption of O₂ and underestimated
the absorption of H₂O at 1.25-cm wavelength. Nonetheless, he raised an important
concern, which was to go unheeded. His later absorption estimates (Van Vleck 1945,
1947) were more accurate.

Fig. A13.1 Energy levels of the ground vibrational state of H₂O, an asymmetric rotating
molecule. The quantum numbers are denoted J_{K₋₁K₊₁}. The values of K₋₁/K₊₁ are even/odd and
odd/even for ortho states and even/even and odd/odd for para states. The seven most important
transitions responsible for making the atmosphere opaque at ALMA for a water vapor pressure of
1 mm and frequencies less than 1 THz (see Fig. 13.14) are marked with their frequencies (380, 448,
557, 621, 752, 920, and 987 GHz) along with the ground state transition at 1113 GHz. The
diagnostic lines at 22 and 183 GHz used in water-vapor radiometry (Sects. 13.3.2 and 13.3.3) are
shown with dotted lines. Other molecular lines causing high opacity are O₂ at 60 GHz and O₂ at
118 GHz. Data from Splatalogue (2016).
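The conversion from 447 cm⁻¹ to an equivalent temperature, and the resulting Boltzmann factor of about 0.1, can be reproduced in a few lines. This is a minimal sketch using CODATA constants; the choice of 280 K for "atmospheric temperature" is an assumption made here for illustration.

```python
import math

H = 6.62607015e-34     # Planck constant (J s)
C = 2.99792458e8       # speed of light (m/s)
K_B = 1.380649e-23     # Boltzmann constant (J/K)


def wavenumber_to_kelvin(wavenumber_per_cm):
    """Equivalent temperature E/k for an energy level given in cm^-1."""
    return H * C * (wavenumber_per_cm * 100.0) / K_B


def boltzmann_factor(energy_kelvin, temperature_kelvin):
    """Relative population factor exp(-E/kT)."""
    return math.exp(-energy_kelvin / temperature_kelvin)


E_K = wavenumber_to_kelvin(447.0)
factor = boltzmann_factor(E_K, 280.0)   # assumed atmospheric temperature
print(f"E/k = {E_K:.0f} K, Boltzmann factor at 280 K = {factor:.2f}")
```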

Late in World War II, the 3-cm airborne radar had proved to be highly successful.
To obtain even higher resolution for antennas of similar size, a new system at
1.25 cm was planned as more powerful microwave signal sources became available.
Van Vleck and Townes warned that the new system would have difficulties because
of water vapor absorption (see Townes 1952, 1999; Buderi 1996; and Sullivan
2009), but development proceeded nonetheless. The range of the new system,
looking along horizontal paths, was found to be typically only 20 km or less, a
tremendous disappointment. The cause was quickly traced to atmospheric water
vapor absorption. Dicke et al. (1946) traced out the line shape from atmospheric
brightness temperature measurements in Florida in 1945 and established the
wavelength to be 1.34 cm (ν = 22.2 GHz) and also determined the line profile
and absorption coefficient. Planned deployment of the system to the moist South
Pacific war zone was canceled. Townes and Merritt (1946) measured the transition
at low pressure in the lab to high accuracy (ν = 22237 ± 5 MHz, 1.349 cm). The
modern standard frequency of the transition, weighted over its hyperfine transitions,
is 22235.080 MHz (Kukolich 1969).


Appendix 13.2 Derivation of the Tropospheric Phase 
Structure Function 


The purpose of this appendix is to derive the phase structure function for the 
troposphere from the structure function for the index of refraction in a turbulent 
medium. We follow the derivation of Tatarski (1961). 

The structure function of phase is defined as 


$$ D_\phi = \left\langle [\phi_1(x_1) - \phi_2(x_2)]^2 \right\rangle , \tag{A13.1} $$


where x_1 and x_2 are two measurement points as shown in Fig. A13.2, which for
our purposes form a baseline interferometer normal to the incoming propagation
direction as viewed outside the homogeneous scattering layer of thickness L. The
turbulence is considered to be “frozen” as it moves along the x axis. The ensemble
average is usually approximated as a time average of duration T, where T is much
longer than the crossing time of the turbulent cells, i.e., T ≫ d/v_s, where v_s
is the wind speed component along the baseline. The initially flat phase front is
distorted by the turbulent medium, as shown in the right side of Fig. A13.2. The
phase structure function depends only on the separation d = |x_1 − x_2|. The structure
function of the index of refraction for Kolmogorov turbulence has the general form


$$ D_n(\mathbf{r}) = C_n^2\, |\mathbf{r}|^{2/3} , \tag{A13.2} $$


where r is the vector separation between two points in the turbulent medium.
With the assumption that the medium is isotropic and homogeneous, the structure
function becomes a function of only the scalar separation, r,


$$ D_n(r) = C_n^2\, r^{2/3} . \tag{A13.3} $$


Fig. A13.2 (left) Cartoon of a frozen turbulent layer of the troposphere moving along the x axis
at velocity v_s. The structure function is measured at two points, x_1 and x_2. (right) The incoming
signal phase from a point source at the bottom and top of the scattering layer.

From a strict ray-tracing calculation, the phases at some instant will be given by


$$ \begin{aligned}
\phi_1 &= \frac{2\pi}{\lambda} \int_0^L n(y, x_1)\, dy \\
\phi_2 &= \frac{2\pi}{\lambda} \int_0^L n(y, x_2)\, dy ,
\end{aligned} \tag{A13.4} $$


where n is the index of refraction along the y axis perpendicular to the baseline,
and λ is the wavelength. Except at very dry sites, the fluctuations in refraction are
primarily caused by variation in the water vapor density. The difference in phase at
points x_1 and x_2 is therefore


$$ \phi_1 - \phi_2 = \frac{2\pi}{\lambda} \int_0^L [n(y, x_1) - n(y, x_2)]\, dy , \tag{A13.5} $$


and the squared difference of phase is


$$ (\phi_1 - \phi_2)^2 = \left(\frac{2\pi}{\lambda}\right)^2 \int_0^L [n(y_a, x_1) - n(y_a, x_2)]\, dy_a
\times \int_0^L [n(y_b, x_1) - n(y_b, x_2)]\, dy_b , \tag{A13.6} $$


or


$$ (\phi_1 - \phi_2)^2 = \left(\frac{2\pi}{\lambda}\right)^2 \int_0^L\!\!\int_0^L [n(y_a, x_1) - n(y_a, x_2)]
\times [n(y_b, x_1) - n(y_b, x_2)]\, dy_a\, dy_b . \tag{A13.7} $$


We could expand the integrand in Eq. (A13.7) into cross products of n at different
positions. However, we prefer to express the final result in terms of structure
functions rather than correlation functions. To proceed, we use the algebraic identity


$$ (a - b)(c - d) = \tfrac{1}{2}\left[(a - d)^2 + (b - c)^2 - (a - c)^2 - (b - d)^2\right] . \tag{A13.8} $$


Substituting Eq. (A13.7) into Eq. (A13.1), making use of Eq. (A13.8), and taking the
expectation term by term, we obtain


$$ \begin{aligned}
D_\phi(d) = \frac{1}{2}\left(\frac{2\pi}{\lambda}\right)^2 \int_0^L\!\!\int_0^L \Big\{
 & \left\langle [n(y_a, x_1) - n(y_b, x_2)]^2 \right\rangle
 + \left\langle [n(y_a, x_2) - n(y_b, x_1)]^2 \right\rangle \\
 & - \left\langle [n(y_a, x_1) - n(y_b, x_1)]^2 \right\rangle
 - \left\langle [n(y_a, x_2) - n(y_b, x_2)]^2 \right\rangle \Big\}\, dy_a\, dy_b .
\end{aligned} \tag{A13.9} $$

The four terms in Eq. (A13.9) are structure functions of the index of refraction for
various separations, as defined in Eq. (A13.3). Note that the separation in the first
two terms is [(y_a − y_b)² + (x_1 − x_2)²]^{1/2}, while the separation in the second two terms
is |y_a − y_b|. Hence, the structure function of phase can now be written as


$$ D_\phi(d) = \left(\frac{2\pi}{\lambda}\right)^2 \int_0^L\!\!\int_0^L \left[
D_n\!\left(\sqrt{(y_a - y_b)^2 + (x_1 - x_2)^2}\right) - D_n(|y_a - y_b|) \right] dy_a\, dy_b . \tag{A13.10} $$

The integral in Eq. (A13.10) can be simplified because the arguments of D_n are a
function of only y_a − y_b. Note that an integral of the form


$$ I = \int_0^L\!\!\int_0^L f(y_a - y_b)\, dy_a\, dy_b \tag{A13.11} $$


can be simplified (see Fig. A13.3) by a change in variables to y = y_a − y_b and y_b.
For the case in which f(y_a − y_b) is an even function, Eq. (A13.11) becomes


$$ I = 2 \int_0^L (L - y)\, f(y)\, dy . \tag{A13.12} $$


By use of this relation, the structure function of phase becomes


$$ D_\phi(d) = 2 \left(\frac{2\pi}{\lambda}\right)^2 \int_0^L (L - y) \left[ D_n\!\left(\sqrt{y^2 + d^2}\right) - D_n(y) \right] dy . \tag{A13.13} $$
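The reduction of the double integral in Eq. (A13.11) to the single integral in Eq. (A13.12) can be verified numerically for any even function. The sketch below uses f(u) = |u|^{2/3} as a test function, for which the single integral evaluates analytically to 0.45 L^{8/3}; the midpoint-rule grid sizes are arbitrary choices.

```python
import numpy as np


def double_integral(f, L, n=1200):
    """Midpoint-rule estimate of int_0^L int_0^L f(y_a - y_b) dy_a dy_b."""
    h = L / n
    y = (np.arange(n) + 0.5) * h
    diff = y[:, None] - y[None, :]          # all values of y_a - y_b
    return float(np.sum(f(diff)) * h * h)


def single_integral(f, L, n=200_000):
    """Midpoint-rule estimate of 2 * int_0^L (L - y) f(y) dy."""
    h = L / n
    y = (np.arange(n) + 0.5) * h
    return float(2.0 * np.sum((L - y) * f(y)) * h)


def f_even(u):
    """Even test function with the same 2/3 power law as D_n."""
    return np.abs(u) ** (2.0 / 3.0)


L = 2.0
I_double = double_integral(f_even, L)
I_single = single_integral(f_even, L)
I_exact = 0.45 * L ** (8.0 / 3.0)
print(f"double = {I_double:.4f}, single = {I_single:.4f}, exact = {I_exact:.4f}")
```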


Substitution of Eq. (A13.3) into Eq. (A13.13) gives


$$ D_\phi(d) = 2 \left(\frac{2\pi}{\lambda}\right)^2 C_n^2 \int_0^L (L - y) \left[ (y^2 + d^2)^{1/3} - y^{2/3} \right] dy . \tag{A13.14} $$

This equation is the general starting point for most discussions (see Tatarski 1961,
eq. 6.27). The distinction of two major regimes, d ≪ L and d ≫ L, was first made
in the context of radio interferometry by Stotskii (1973, 1976) and further discussed
by Dravskikh and Finkelstein (1979) and Coulman (1990).

Fig. A13.3 The change in integration variables from y_a, y_b to y, y_b, where y = y_a − y_b, for the
derivation of Eq. (A13.12).

The case d ≪ L is called the “three-dimensional,” or 3-D, turbulence solution.
The integrand in Eq. (A13.14) is maximum at y = 0, where it equals L d^{2/3}, and
decreases monotonically to zero as y increases. It declines slowly for y < d, and
for larger y, it decreases as y^{−4/3}. Hence, the integrand is approximately constant
in the range of y from 0 to d, and most of the contribution to the integral is within
this range. Thus, from Eq. (A13.14), D_φ ~ L d^{2/3} × d ~ L d^{5/3}. The proportionality
constant, as reported by Tatarski (1961, eq. 6.65), based on analytic approximation,
is about 2.91. Hence,


$$ D_\phi(d) = 2.91 \left(\frac{2\pi}{\lambda}\right)^2 C_n^2\, L\, d^{5/3} , \qquad L_{\rm in} \ll d \ll L . \tag{A13.15} $$


The case d ≫ L is called the “two-dimensional,” or 2-D, turbulence case. Stotskii
(1973) and Coulman (1990) presented the reasons why Eq. (A13.14), strictly valid
for isotropic turbulence, can be used for this case. When d ≫ L, the argument in
brackets in Eq. (A13.14) becomes ~ d^{2/3} − y^{2/3}, and Eq. (A13.14) can be integrated
directly. The leading term in the integral is ½ L² d^{2/3}, which gives


$$ D_\phi(d) \simeq \left(\frac{2\pi}{\lambda}\right)^2 C_n^2\, L^2 d^{2/3} , \qquad L \ll d \ll L_{\rm out} . \tag{A13.16} $$


For d > L_out, D_n becomes independent of distance, and D_φ becomes flat.
It is interesting to note that the two structure functions given in Eqs. (A13.15)
and (A13.16) intersect at a distance


$$ d = L/2.9 , \tag{A13.17} $$


which can be taken to be the nominal transition point from 3-D to 2-D turbulence.
For a scale height of 2 km, this would be about 700 m. Numerical integrations of
Eq. (A13.14) have been done by Treuhaft and Lanyi (1987). An example of such
an integration is shown in Fig. A13.4. Note that the transition from the 2-D to 3-D
structure functions is rather gradual. This probably explains why large variations in
the power-law index have been reported from observational data.
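A direct numerical integration of Eq. (A13.14) illustrates how slowly the exact structure function approaches its two power-law limits. The sketch below is not the calculation of Treuhaft and Lanyi (1987); it is a simple trapezoidal integration on a logarithmic grid, with the prefactor (2π/λ)² C_n² set to unity and the grid parameters chosen arbitrarily.

```python
import numpy as np


def d_phi(d, L, n=6000):
    """Eq. (A13.14) with (2*pi/lambda)^2 * C_n^2 set to 1.

    Trapezoidal rule on a logarithmic grid in y, which resolves the
    integrand near y ~ d even when d is much smaller than L."""
    y_min = 1e-6 * min(d, L)
    y = np.concatenate(([0.0], np.logspace(np.log10(y_min), np.log10(L), n)))
    y[-1] = L                                   # guard against rounding in logspace
    integrand = (L - y) * ((y * y + d * d) ** (1.0 / 3.0) - y ** (2.0 / 3.0))
    return 2.0 * float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(y)))


def d_phi_3d(d, L):
    """3-D limit, Eq. (A13.15), same normalization."""
    return 2.91 * L * d ** (5.0 / 3.0)


def d_phi_2d(d, L):
    """2-D limit, Eq. (A13.16), same normalization."""
    return L ** 2 * d ** (2.0 / 3.0)


L = 2000.0   # layer thickness (m)
for d in (1.0, 10.0, 100.0, L / 2.9, 1e4, 1e5):
    exact = d_phi(d, L)
    print(f"d = {d:9.1f} m  exact/3-D = {exact / d_phi_3d(d, L):6.3f}  "
          f"exact/2-D = {exact / d_phi_2d(d, L):6.3f}")
```

Near d = L/2.9, the exact value falls well below both asymptotes, consistent with the gradual transition noted above.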

The results described above can be generalized for the case in which the
propagation angle is not normal to the baseline but rather is at an angle γ. In
this situation, L is replaced by L sec γ for a plane-parallel atmosphere. Thus,
Eqs. (A13.15) and (A13.16) show that the structure functions vary as sec γ and sec² γ
for the 3-D and 2-D cases, respectively.

Fig. A13.4 (bottom) Structure function of phase vs. baseline length (d) and its power-law
approximations for a layer thickness of 2 km and turbulence parameter C_n² = 1. The intersection
of the two power-law components, which occurs at d = L/2.9 = d_b, or about 700 m, is marked
by the thin vertical line. (top) Power-law index as a function of baseline for the structure function
of phase.

Chapter 14
Propagation Effects: Ionized Media


Three distinct ionized media, or plasmas, affect the propagation of radio signals 
passing through them: the Earth’s ionosphere; the interplanetary medium, also 
known as the solar wind; and the interstellar medium of our Galaxy. The effects 
of scattering in other galaxies or in the media between galaxies are not usually 
important. There are several essential differences between neutral and ionized media 
with regard to propagation. For neutral media, the index of refraction is greater 
than unity and is unaffected by magnetic fields. In ionized media, the index of 
refraction is less than unity and is strongly affected by magnetic fields. Most plasma
phenomena scale as ν⁻², and their effects can be avoided or mitigated, if desired, by
observations at high frequency. Absorption plays an important role in neutral media
but very little in ionized media since most radio astronomical observations occur at 
frequencies far above the plasma frequency. Descriptions of scattering phenomena 
in both types of media are based on Kolmogorov theory. However, the situation in 
the neutral troposphere is greatly simplified because the turbulent layer lies close to 
the observer, and only phase fluctuations develop. The ionized media lie far from 
the observer, and both phase and amplitude fluctuations are often present in the 
wavefront when it reaches the observer. 


14.1 Ionosphere 


The ionosphere has been studied extensively since the pioneering experiments of
Appleton and Barnett (1925) and Breit and Tuve (1926). The literature on the subject
is vast. Magneto-ionic propagation theory relevant to the ionosphere is treated in
depth by Ratcliffe (1962) and Budden (1961); the general physics and chemistry
of the ionosphere is described by Schunk and Nagy (2009); and an excellent
general treatment of ionospheric propagation is given by Davies (1965). Reviews
of particular relevance to radio astronomy can be found in Evans and Hagfors
(1968) and Hagfors (1976). Beynon (1975) gives interesting historical anecdotes
on the early development of ionospheric research. In this section, we treat only
those aspects of the ionosphere that have a deleterious effect on interferometric
observations. Table 14.1 gives the magnitude of various propagation effects for the
daytime and nighttime ionosphere. Most of these effects scale as ν⁻², and they
can be minimized by observing at higher frequencies. For small zenith angles,
the magnitude of the ionospheric excess path typically equals that of the neutral
atmosphere at approximately 2 GHz, but the frequency of this equality can vary
from about 1 to 5 GHz. Thus, at 20 GHz and small zenith angles, the ionospheric
excess path length is typically only 1% of the tropospheric excess path length.
However, at very large zenith angles, i.e., near the horizon, the effects are equal
at about 300 MHz.

Table 14.1 Maximum likely values of ionospheric effects at 100 MHz for a zenith angle(a) of 60°

Effect                               Maximum(b)       Minimum(c)       Frequency
                                     (daytime)        (night)          dependence
Faraday rotation                     15 rotations     1.5 rotations    ν⁻²
Group delay                          12 μs            1.2 μs           ν⁻²
Excess (phase) path length           3500 m           350 m            ν⁻²
Phase change                         7500 rad         750 rad          ν⁻¹
Phase stability (peak to peak)       ±150 rad         ±15 rad          ν⁻¹
Frequency stability (rms)            ±0.04 Hz         0.004 Hz         ν⁻¹
Absorption (in D and F regions)      0.1 dB(d)        0.01 dB          ν⁻²
Refraction (ambient)                 0.05°            0.005°           ν⁻²
Isoplanatic patch                    –                ~5°              ν

Adapted from Evans and Hagfors (1968).
(a) For values of parameters at the zenith, divide numbers (except refraction) by sec z_i, which
is approximately 1.7 [see Eq. (14.14)]. For typical (rather than maximum) parameters, divide
numbers by 2.
(b) Total electron content = 5 × 10¹⁷ m⁻².
(c) Total electron content = 5 × 10¹⁶ m⁻².
(d) 1 dB = 0.230 nepers.
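The frequency dependences listed in Table 14.1 allow the 100-MHz values to be scaled to other observing frequencies. The sketch below does this for a few of the daytime maxima; the dictionary layout and function name are illustrative, and the scaling is only as good as the simple power laws in the table.

```python
# Daytime maxima at 100 MHz (zenith angle 60 deg) and frequency exponents,
# taken from Table 14.1.
EFFECTS_100MHZ = {
    "Faraday rotation (rotations)": (15.0, -2),
    "Group delay (microseconds)": (12.0, -2),
    "Excess path length (m)": (3500.0, -2),
    "Phase change (rad)": (7500.0, -1),
    "Refraction (deg)": (0.05, -2),
}


def scale_effect(value_100mhz, exponent, freq_hz):
    """Scale a 100-MHz value to freq_hz assuming a power law nu**exponent."""
    return value_100mhz * (freq_hz / 100e6) ** exponent


freq = 1.0e9   # example observing frequency (Hz)
for name, (value, exponent) in EFFECTS_100MHZ.items():
    print(f"{name}: {scale_effect(value, exponent, freq):.4g} at 1 GHz")

# Ionospheric-to-tropospheric excess path ratio at 20 GHz if the two are
# equal at 2 GHz and the tropospheric path is independent of frequency.
ratio_20ghz = (2.0 / 20.0) ** 2
print(f"ionosphere/troposphere excess path at 20 GHz: {ratio_20ghz:.0%}")
```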


14.1.1 Basic Physics 


The ionization of the upper atmosphere is caused by the ultraviolet radiation from
the Sun. Typical daytime and nighttime vertical profiles of the electron density are
shown in Fig. 14.1. The electron distribution and the total electron content vary also
with geomagnetic latitude, time of year, and sunspot cycle. There are also substantial
winds, traveling disturbances, and irregularities in the ionosphere. The ionosphere is
permeated by the quasi-dipole magnetic field of the Earth. Propagation is governed
by the theory of waves in a magnetized plasma with collisions.

Fig. 14.1 Idealized electron density distribution in the Earth’s ionosphere. The curves indicate the
densities to be expected at sunspot maximum in temperate latitudes. Peak sunspot activity occurs
at 11-year intervals, most recently peaking in 2001 (cycle 23) and ~ 2012 (cycle 24). Cycle 24 was
about a year late and rather weak [see Janardhan et al. (2015)]. From J. V. Evans and T. Hagfors,
Radar Astronomy, 1968. © McGraw-Hill Education.

We derive some of the fundamental properties of the ionosphere related to 
the propagation of electromagnetic waves by considering elementary cases. First, 
consider a plane monochromatic wave of linear polarization that propagates through 
a uniform plasma of electron density ne, where the magnetic field and collisions 
between particles can be neglected. The electrons oscillate with the electric field, 
but the protons, because of their greater mass, remain relatively unperturbed. The 
index of refraction can be found by calculating either the induced current or the 
dipole moment. Either method yields the same result. We use the latter method, as 
we did when considering the index of refraction of water vapor using the bound 
oscillator model in Sect. 13.1.4. The equation of motion of a free electron in the 
plasma is 


$$ m\ddot{x} = -eE_0\, e^{-j2\pi\nu t} , \tag{14.1} $$


where m, e, and x are the mass, charge magnitude, and displacement of the electron, 
and $E_0$ and $\nu$ are the amplitude and frequency of the electric field E of the incident 
wave. The magnetic field of the plane wave has negligible influence on the electrons 
as long as the electron velocity is much less than c, and the electric field has 
negligible influence on the motion of the protons. The steady-state solution to 
Eq. (14.1) is


$$ x = \frac{eE_0\, e^{-j2\pi\nu t}}{(2\pi\nu)^2 m} . \tag{14.2} $$

Note that the induced current density is $i = -n_e e\dot{x}$, where $\dot{x}$, the velocity of the 
particle, is 90° out of phase with the driving electric field. Thus, the work done by the wave 
on the particles, which is $\langle i \cdot E \rangle$, is zero, and the wave propagates without loss, as 
expected, since Eq. (14.1) has no dissipative terms. The dipole moment per unit 
volume P is equal to $-n_e e x_0$, where $x_0$ is the amplitude of oscillation. The dielectric 
constant $\varepsilon$ is $1 + (P/E_0)/\epsilon_0$, where $\epsilon_0$ is the permittivity of free space, so that 


$$ \varepsilon = 1 - \frac{n_e e^2}{4\pi^2\nu^2\epsilon_0 m} . \tag{14.3} $$


The dielectric constant is real and less than unity because the induced dipole is 180° 
out of phase with the driving field. 

The index of refraction n is equal to the square root of $\varepsilon$, and in this case is real, 
so 


$$ n = \sqrt{1 - \frac{\nu_p^2}{\nu^2}} , \tag{14.4} $$


where 


$$ \nu_p = \frac{e}{2\pi}\sqrt{\frac{n_e}{\epsilon_0 m}} \simeq 9\sqrt{n_e} \quad \text{(Hz)} , \tag{14.5} $$


and $n_e$ is in m$^{-3}$. $\nu_p$ is known as the plasma frequency, which is also the natural 
frequency of mechanical oscillations in the plasma [see, e.g., Holt and Haskell 
(1965)]. The plasma frequency of the ionosphere (see Fig. 14.1) is usually less 
than 12 MHz. Waves normally incident on a plasma with frequencies below $\nu_p$ are 
perfectly reflected. The phase velocity of a wave with $\nu > \nu_p$ in the plasma is c/n, 
which is greater than c, and the group velocity of a wave packet is cn, which is less 
than c. 
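
As a quick numerical illustration of Eqs. (14.4) and (14.5), the following Python sketch 
evaluates the plasma frequency and the index of refraction for a uniform plasma. The 
electron density of 1.8 x 10^12 m^-3 and the observing frequency of 100 MHz are assumed 
values chosen only for illustration, and the function names are hypothetical. 

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge magnitude, C
E_MASS = 9.1093837015e-31    # electron mass, kg
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m


def plasma_frequency(n_e):
    """Plasma frequency in Hz from Eq. (14.5); n_e in m^-3."""
    return (E_CHARGE / (2.0 * math.pi)) * math.sqrt(n_e / (EPS0 * E_MASS))


def refractive_index(nu, n_e):
    """Index of refraction from Eq. (14.4); only defined for nu > nu_p."""
    nu_p = plasma_frequency(n_e)
    if nu <= nu_p:
        raise ValueError("nu <= nu_p: a normally incident wave is reflected")
    return math.sqrt(1.0 - (nu_p / nu) ** 2)


n_e = 1.8e12   # assumed peak daytime electron density, m^-3
nu = 100e6     # assumed observing frequency, Hz

nu_p = plasma_frequency(n_e)
n = refractive_index(nu, n_e)
print(f"plasma frequency   = {nu_p / 1e6:.2f} MHz")
print(f"index of refraction = {n:.6f}")
print(f"phase velocity / c  = {1.0 / n:.6f}")   # c/n, greater than 1
print(f"group velocity / c  = {n:.6f}")         # cn/c, less than 1
```

The coefficient returned by `plasma_frequency(1.0)` reproduces the factor of about 9 in 
Eq. (14.5). 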

Now consider a plasma with a static magnetic field B in the direction of 
propagation of the plane wave. The vector equation of motion of an electron, called 
the Lorentz equation, is 


$$ m\dot{\mathbf{v}} = -e\,[\mathbf{E} + \mathbf{v}\times\mathbf{B}] , \tag{14.6} $$


where $\mathbf{v}$ is the vector velocity. Let the incident field be a circularly polarized wave. 
If B is zero, the particle will follow the tip of the electric field vector and move in 
a circular orbit. If B is nonzero, the sum of the $\mathbf{v}\times\mathbf{B}$ force term, which will be 
in the radial direction, and the electric force term must be balanced by centripetal 
acceleration. Thus, there is a basic anisotropy in the plasma depending on whether 
the wave is right or left circularly polarized, since the sign of the $\mathbf{v}\times\mathbf{B}$ term changes 
between the two cases. The radius $R_e$ of the circular orbit of the electron is derived 
from the balance-of-forces equation $eE_0 \pm evB = mv^2/R_e$, where the scalar velocity 
$v = 2\pi\nu R_e$, B is the magnitude of the magnetic field, and the upper and lower signs 
refer to left and right circular polarization, respectively. Thus, we obtain 


$$ R_e = \frac{eE_0}{4\pi^2 m\nu^2 \mp 2\pi\nu eB} . \tag{14.7} $$


Following the same procedure as the one described below Eq. (14.2), we find that 
the index of refraction is given by the equation 


$$ n^2 = 1 - \frac{\nu_p^2}{\nu(\nu \mp \nu_B)} , \tag{14.8} $$


where $\nu_B$ is the gyrofrequency, or cyclotron frequency, given by 


$$ \nu_B = \frac{eB}{2\pi m} . \tag{14.9} $$


The gyrofrequency is the frequency at which an electron would spiral around a 
magnetic field line in the absence of any electromagnetic radiation. In the absence 
of damping, $R_e$ would go to infinity if the applied electric field frequency were $\nu_B$. 
The gyrofrequency of the Earth’s magnetic field in the ionosphere ($\sim 0.5 \times 10^{-4}$ 
tesla) is about 1.4 MHz. 

Equation (14.8) gives the index of refraction for the case of a longitudinal 
magnetic field, that is, where the field is parallel to the direction of wave 
propagation. The solution for the transverse case is different. The solution for the 
quasi-longitudinal case is obtained by replacing B with $B\cos\theta$, where $\theta$ is the angle 
between the propagation vector and the direction of the magnetic field. The 
quasi-longitudinal solution is applicable when the angle $\theta$ is less than that specified 
by the inequality (Ratcliffe 1962) 


$$ \frac{1}{2}\sin\theta\tan\theta < \frac{\nu^2 - \nu_p^2}{\nu\nu_B} . \tag{14.10} $$


When $\nu > 100$ MHz, $\nu_p \simeq 10$ MHz, and $\nu_B \simeq 1.4$ MHz, the quasi-longitudinal 
solution is valid for $|\theta| < 89°$, or virtually all cases of interest. Therefore, to a high 
accuracy, when $\nu \gg \nu_p$ and $\nu_B$, we can expand Eq. (14.8) to obtain 


$$ n \simeq 1 - \frac{1}{2}\frac{\nu_p^2}{\nu^2} \mp \frac{1}{2}\frac{\nu_p^2\nu_B}{\nu^3}\cos\theta , \tag{14.11} $$

where we neglect terms in $\nu^{-4}$ and higher order. For propagation in the direction of 
B, the index of refraction is lower for a left circularly polarized wave than for a right 
circularly polarized wave. 
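
The quality of the expansion in Eq. (14.11) can be checked numerically against the 
quasi-longitudinal form of Eq. (14.8). The sketch below does this for the representative 
values quoted above (100 MHz, 10 MHz, and 1.4 MHz); the function names are hypothetical, 
and the sign handling assumes, as stated after Eq. (14.7), that the upper sign refers to 
left circular polarization. 

```python
import math


def n_exact(nu, nu_p, nu_b, theta, left=True):
    """Index from Eq. (14.8) with B replaced by B cos(theta)."""
    sign = -1.0 if left else 1.0   # upper sign (left circular): nu - nu_B
    n_sq = 1.0 - nu_p**2 / (nu * (nu + sign * nu_b * math.cos(theta)))
    return math.sqrt(n_sq)


def n_expanded(nu, nu_p, nu_b, theta, left=True):
    """High-frequency expansion of Eq. (14.11)."""
    sign = -1.0 if left else 1.0   # upper sign (left circular): minus term
    return (1.0 - 0.5 * (nu_p / nu) ** 2
            + sign * 0.5 * nu_p**2 * nu_b * math.cos(theta) / nu**3)


nu, nu_p, nu_b, theta = 100e6, 10e6, 1.4e6, 0.0
for left in (True, False):
    label = "left " if left else "right"
    exact = n_exact(nu, nu_p, nu_b, theta, left)
    approx = n_expanded(nu, nu_p, nu_b, theta, left)
    print(f"{label}: exact = {exact:.8f}  expanded = {approx:.8f}")
```

The residual between the two forms is dominated by the neglected term of order 
$(\nu_p/\nu)^4$, so it shrinks rapidly at higher observing frequencies. 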

The difference in the index of refraction for right and left circularly polarized 
waves leads to the important phenomenon of Faraday rotation, whereby a linearly 
polarized wave has its plane of polarization rotated as it propagates through the 
plasma. A linearly polarized wave with position angle $\psi$ can be decomposed into 
right and left circularly polarized waves of equal amplitude and phase difference $2\psi$. 
The phases of the two circular waves as they propagate in the y direction through a 
plasma are $2\pi\nu n_r y/c$ and $2\pi\nu n_\ell y/c$, where $n_r$ and $n_\ell$ are the indices of refraction for 
the right circular and left circular modes, respectively. The phase difference between 
the waves is $2\pi\nu(n_r - n_\ell)y/c$. From Eq. (14.11), $n_r - n_\ell = \nu_p^2\nu_B\nu^{-3}\cos\theta$. Since the 
position angle changes by half the phase difference, the plane of polarization is rotated 
by the angle 


$$ \Delta\psi = \frac{\pi}{c\nu^2}\int \nu_p^2\,\nu_B\cos\theta\, dy , \tag{14.12} $$


where $\nu_p$, $\nu_B$, and $\theta$ may be functions of y. 
For constant magnetic field and electron density, Eq. (14.12) can be written 


$$ \Delta\psi = 2.6\times10^{-13}\, n_e B\lambda^2 L\cos\theta , \tag{14.13} $$


where $\Delta\psi$ is in radians, $n_e$ is in m$^{-3}$, B is in tesla and is positive when the field is 
pointed toward the observer, $\lambda$ is the wavelength in meters, and L is the path length 
in meters. A magnetic field pointed toward the observer causes the position angle 
to increase (i.e., a counterclockwise rotation of the plane of polarization of incident 
radiation as viewed from the surface of the Earth). 
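
The numerical coefficient in Eq. (14.13) follows from Eqs. (14.5), (14.9), and (14.12). 
The sketch below evaluates the rotation for a uniform slab both ways and shows that they 
agree. The slab parameters (electron density, field strength, path length, and a 
wavelength of 1 m) are assumed values for illustration only, and the function names are 
hypothetical. 

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge magnitude, C
E_MASS = 9.1093837015e-31    # electron mass, kg
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
C_LIGHT = 299792458.0        # speed of light, m/s

# Coefficient of Eq. (14.13): e^3 / (8 pi^2 eps0 m^2 c^3)
FARADAY_COEFF = E_CHARGE**3 / (8.0 * math.pi**2 * EPS0 * E_MASS**2 * C_LIGHT**3)


def plasma_frequency(n_e):
    """Eq. (14.5), Hz, with n_e in m^-3."""
    return (E_CHARGE / (2.0 * math.pi)) * math.sqrt(n_e / (EPS0 * E_MASS))


def gyrofrequency(b_tesla):
    """Eq. (14.9), Hz."""
    return E_CHARGE * b_tesla / (2.0 * math.pi * E_MASS)


def rotation_from_integral(n_e, b_tesla, nu, path, theta=0.0):
    """Eq. (14.12) for a uniform slab, so the integral reduces to a product."""
    return (math.pi / (C_LIGHT * nu**2) * plasma_frequency(n_e) ** 2
            * gyrofrequency(b_tesla) * math.cos(theta) * path)


def rotation_from_coefficient(n_e, b_tesla, wavelength, path, theta=0.0):
    """Eq. (14.13), radians."""
    return FARADAY_COEFF * n_e * b_tesla * wavelength**2 * path * math.cos(theta)


n_e, b_tesla, path = 2.0e12, 0.5e-4, 3.0e5   # assumed slab: m^-3, tesla, m
wavelength = 1.0                              # m
nu = C_LIGHT / wavelength

print(f"coefficient     = {FARADAY_COEFF:.3e}")
print(f"gyrofrequency   = {gyrofrequency(b_tesla) / 1e6:.2f} MHz")
print(f"Eq. (14.12) rotation = {rotation_from_integral(n_e, b_tesla, nu, path):.3f} rad")
print(f"Eq. (14.13) rotation = {rotation_from_coefficient(n_e, b_tesla, wavelength, path):.3f} rad")
```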


14.1.2 Refraction and Propagation Delay 


The situation with ionospheric refraction is different from that of tropospheric 
refraction. The latter occurs in a layer within about 10 km from the ground, and 
most effects can be understood, at least to first order, with a model of a plane-parallel 
medium. Since the index of refraction is slightly larger than unity, incoming rays are 
bent toward the zenith. In contrast, the layers responsible for ionospheric refraction 
occur several hundred kilometers above the surface, as shown in Fig. 14.2. If the 
ionosphere were modeled as a plane-parallel layer, then an incoming ray at a certain 
zenith angle would be bent away from the normal upon entering the layer and then 
bent back by an equal amount when exiting below. In this case, there would be no 
net change in the zenith angle. However, the Earth’s curvature, in combination with 
the index of refraction of less than unity, results in a net deflection toward the zenith, 
the same sense as for tropospheric refraction. A concept that is especially important 
Fig. 14.2 Diagram of a ray passing through a homogeneous but exaggerated ionospheric layer 
from height $h_i$ to $h_i + \Delta h$. Because of the curvature of the Earth, $z_1 = z_2 + \theta_2 \neq z_2$, the net bending 
angle $\Delta z = z - z_0$ is positive, as it is for the troposphere (see Appendix 14.1 for a derivation of $\Delta z$ 
for this single-layer model). Note that if $z_2 > 90°$, there will be total internal reflection, and the ray 
will not reach the Earth’s surface. The effect of this internal reflection condition on the effective 
horizon is also discussed in Appendix 14.1. For the exaggerated case shown, $z_0 = 60°$, n = 0.8, 
and z = 63°. For a case with more realistic parameters, see Appendix 14.1. 


in the context of all-sky or very-wide-field imaging is that the static ionosphere 
acts like an achromatic spherical lens that bends incoming rays toward the zenith 
(Vedantham et al. 2014). 

To understand ionospheric refraction, consider a ray passing through a simple 
ionized layer as shown in Fig. 14.2. Note that the zenith angle of the ray at the 
bottom of the ionosphere is rather different than the zenith angle at the observer. 
That is, with the law of sines, 


$$ z_i = \sin^{-1}\left(\frac{r_0}{r_0 + h_i}\sin z_0\right) . \tag{14.14} $$


We can apply the law of sines to the triangle involving the upper boundary of the 
layer, as well as Snell’s law, to obtain the bending angle of interest, $\Delta z = z - z_0$, 
where $z = z_2 + \theta_1 + \theta_2$ and $z_2$, $\theta_1$, and $\theta_2$ are defined in Fig. 14.2. $\Delta z$ is always 
> 0. The details of the calculation of $\Delta z$ are given in Appendix 14.1. 
The case of a radially stratified ionosphere can be handled by rewriting 
Eq. (13.27) in the form (see Sukumar 1987 for details) 


$$ \Delta z = \frac{A\sin z_0}{r_0\nu^2}\int_0^\infty \frac{\left(1 + \frac{h}{r_0}\right) n_e(h)\, dh}{\left[\left(1 + \frac{h}{r_0}\right)^2 - \sin^2 z_0\right]^{3/2}} , \tag{14.15} $$


where $r_0$ is the radius of the Earth, $n_e(h)$ is the profile of the electron density as 
a function of height, h, and $A = e^2/8\pi^2 m\epsilon_0$. Note that $\nu_p^2 = 2An_e$ [see Eq. (14.5)], 
so that Eq. (14.15) could be written in terms of the vertical distribution of $\nu_p$. Note also that 
$\Delta z = 0$ in the zenith direction and would go to zero for $r_0$ approaching infinity, as 
expected. Since $h \ll r_0$, for $z_0 \ll 1$, the deflection is approximately 


$$ \Delta z \simeq \frac{A\sin z_0}{r_0\nu^2}\int_0^\infty n_e(h)\, dh . \tag{14.16} $$


The ionosphere can be modeled to reasonable accuracy with a parabolic electron 
density distribution of the form 


$$ n_e(h) = n_{e0}\left[1 - \frac{2(h - h_m)^2}{\Delta h^2}\right] , \tag{14.17} $$


as described by Bailey (1948) for $|h - h_m| < \Delta h/\sqrt{2}$, and where $h_m$ is the height of 
the peak of the electron density, $n_{e0}$, and $\Delta h$ is the thickness of the layer. In this 
case, the bending is approximately given by 


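$$ \Delta z \simeq \frac{2\Delta h\sin z_0}{3r_0}\left(\frac{\nu_p}{\nu}\right)^2\left(1 + \frac{h_m}{r_0}\right)\left(\cos^2 z_0 + \frac{2h_m}{r_0}\right)^{-3/2} . \tag{14.18} $$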


The excess path length (see Eqs. 13.4 and 13.5) in the zenith direction can be 
calculated using Eqs. (14.5) and (14.11) with the assumption that $\nu \gg \nu_p$ and $\nu_B$. 
The result is 


$$ \mathcal{L}_0 \simeq -\frac{1}{2}\int_0^\infty \left[\frac{\nu_p(h)}{\nu}\right]^2 dh = -\frac{40.3}{\nu^2}\int_0^\infty n_e(h)\, dh , \tag{14.19} $$


where $\nu$ is in hertz and $n_e(h)$ and $\nu_p(h)$ are the electron density (m$^{-3}$) and plasma 
frequency as a function of height. The integral of electron density over height in 
Eq. (14.19) is called the total electron content (TEC) or column density. The excess 
path corresponds to a phase delay and is negative for the ionosphere. If we 
approximate the ionosphere by a thin layer at height $h_i$, then the excess path length will 
vary as the secant of the zenith angle of the ray as it passes through the layer. Thus, 


$$ \mathcal{L} \simeq \mathcal{L}_0\sec z_i , \tag{14.20} $$
Fig. 14.3 (a) Ionospheric bending angle vs. zenith angle at 1000 MHz from a ray-tracing 
calculation for the daytime electron density profile in Fig. 14.1. The bending predicted by 
Eq. (14.18), with parameters $\nu_p = 12$ MHz, $h_i = 350$ km, $\Delta h = 225$ km, and $r_0 = 6370$ km, 
differs from the curve shown by no more than 5%. (b) Normalized ionospheric excess path 
length vs. zenith angle for the same electron density profile from a ray-tracing calculation 
(solid curve) and from Eq. (14.20) (dashed curve). The total electron content is 
$6.03\times10^{17}$ m$^{-2}$, and the excess path length at the zenith is 24.3 m. The bending and excess 
path length scale as $\nu^{-2}$. The function forms [...] of the troposphere shown in Fig. 13.6. 
[Plot axes: (a) $\Delta z$ (arcsec) vs. $z_0$ (degrees); (b) normalized excess path length vs. $z_0$ (degrees).] 




where $z_i$ (see Fig. 14.2) is given by Eq. (14.14). Because of the diurnal variation in 
$n_e$, it may be important to use the ionospheric penetration coordinates (defined by 
$\theta_1$ and $\theta_2$ in Fig. 14.2) to calculate $\mathcal{L}_0$ in Eq. (14.19) for a particular site. 

When z = 90° and $h_i = 400$ km, $\sec z_i$ is only ~ 3. The secant law provides 
a reasonable model for estimating the excess ionospheric path length. A more 
complex model can be found in Spoelstra (1983). Plots of $\Delta z$ from ray tracing 
and $\mathcal{L}$ obtained from Eqs. (14.18) and (14.20) as well as from actual ray-tracing 
calculations are shown in Fig. 14.3. 
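
The thin-layer model of Eqs. (14.14), (14.19), and (14.20) is simple enough to evaluate 
directly. The sketch below uses the total electron content and frequency quoted for 
Fig. 14.3 and a layer height of 400 km; the function names are hypothetical, and the 
single thin layer is of course only an approximation to a real electron density profile. 

```python
import math

R_EARTH = 6370e3   # Earth radius r_0, m (value quoted for Fig. 14.3)
K_ION = 40.3       # coefficient of Eq. (14.19), m^3 s^-2


def layer_zenith_angle(z0_deg, h_i):
    """Zenith angle (degrees) of the ray at a thin layer of height h_i, Eq. (14.14)."""
    sin_zi = R_EARTH / (R_EARTH + h_i) * math.sin(math.radians(z0_deg))
    return math.degrees(math.asin(sin_zi))


def zenith_excess_path(tec, nu):
    """Zenith excess path in meters, Eq. (14.19); tec in m^-2, nu in Hz."""
    return -K_ION * tec / nu**2


def slant_excess_path(tec, nu, z0_deg, h_i):
    """Secant law of Eq. (14.20) for a thin layer at height h_i."""
    z_i = math.radians(layer_zenith_angle(z0_deg, h_i))
    return zenith_excess_path(tec, nu) / math.cos(z_i)


tec = 6.03e17   # total electron content, m^-2
nu = 1.0e9      # 1000 MHz
h_i = 400e3     # layer height, m

for z0 in (0.0, 60.0, 90.0):
    z_i = layer_zenith_angle(z0, h_i)
    path = slant_excess_path(tec, nu, z0, h_i)
    print(f"z0 = {z0:4.1f} deg  z_i = {z_i:5.2f} deg  excess path = {path:7.2f} m")
```

Even at the horizon the ray crosses the layer at a zenith angle well short of 90°, which 
is why the secant factor stays near 3 rather than diverging. 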

In some applications, it is necessary to correct the measurements of fringe 
frequency for the effects of ionospheric delay. The ionospherically induced 
frequency shift at an antenna is $(\nu/c)\,d\mathcal{L}/dt$. The time rate of change in excess path 
length $d\mathcal{L}/dt$ has two components: one caused by the time rate of change of zenith 
angle dz/dt, and the other caused by the time rate of change of $\mathcal{L}_0$, $d\mathcal{L}_0/dt$. At many 
times, especially near sunrise and sunset, the latter term may dominate (Mathur et al. 
1970; Hagfors 1976). 
14.1.3 Calibration of Ionospheric Delay 


The excess ionospheric path length must be calibrated as accurately as possible 
in experiments involving precise determination of source positions or baselines. 
Three approaches are possible. In the first approach, models of the ionosphere 
can be constructed that depend on parameters such as geomagnetic latitude, solar 
time, season, and solar activity. Two such models are the International Reference 
Ionosphere (IRI) (Bilitza 1997) and the Parametrized Ionosphere Model (PIM) 
(Daniell et al. 1995). 

In the second approach, estimates of the total electron content can be obtained 
from measurements of the dual-frequency transmissions from the Global Position- 
ing System (GPS) (Ho et al. 1997; Mannucci et al. 1998). GPS has replaced the 
more traditional methods such as ionosondes, Faraday rotation of satellite signals, 
and incoherent backscatter radar (Evans 1969). The usefulness of GPS for phase 
correction of array data has been tested at the VLA (Erickson et al. 2001). Four 
GPS receivers were installed, one at the array center, and one at the end of each 
arm. The GPS receiver measured the TEC along lines of sight to the GPS satellite. 
When compared with interferometric phases at 330 MHz, the GPS system was 
effective in predicting wavefront slopes from large-scale structures (> 1,000 km) 
in the ionosphere. GPS methods have also been used in the calibration of VLBI 
observations (e.g., Ros et al. 2000). 

In the third approach, the differential path length effects can be virtually 
eliminated for unresolved sources by making astronomical observations simultaneously 
at two widely separated frequencies, $\nu_1$ and $\nu_2$. If the interferometer phases are $\phi_1$ 
and $\phi_2$ at the two frequencies, then the quantity 


$$ \phi_c = \phi_2 - \left(\frac{\nu_1}{\nu_2}\right)\phi_1 \tag{14.21} $$


will preserve source position information and be substantially free of ionospheric 
delay effects. This technique will correct for the effects of all plasmas along the 
line of sight, not only the ionosphere. A small residual error remains because of 
higher-order frequency terms in the index of refraction and because the rays at 
the two frequencies traverse slightly different paths through the ionosphere. Dual- 
frequency observations are widely used in astrometric radio interferometry where 
source structure can be neglected [see, e.g., Sect. 12.6; Fomalont and Sramek 
(1975); Kaplan et al. (1982); Shapiro (1976)]. Note that the difference in TEC along 
the ray paths to the interferometer elements can be estimated from measurement of 
$\phi_2 - (\nu_2/\nu_1)\phi_1$. Similar dual-frequency systems can be employed for the transfer of 
a local oscillator reference to a space-based VLBI station [see, for example, Moran 
(1989) and Sect. 9.10]. 
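
The way the two phase combinations separate geometry from plasma can be seen with a toy 
model. In the sketch below, each interferometer phase is taken to be the sum of a 
geometric term proportional to frequency and an ionospheric term proportional to 
1/frequency, the latter derived from the excess path of Eq. (14.19). The sign convention, 
the use of unwrapped phases, the frequencies (2.3 and 8.4 GHz), and the numerical values 
of delay and differential TEC are all assumptions made for illustration. 

```python
import math

C_LIGHT = 299792458.0   # speed of light, m/s
K_ION = 40.3            # coefficient of Eq. (14.19), m^3 s^-2


def interferometer_phase(nu, tau_g, delta_tec):
    """Toy unwrapped phase: geometric delay term plus differential ionospheric term."""
    geometric = 2.0 * math.pi * nu * tau_g
    ionospheric = -2.0 * math.pi * K_ION * delta_tec / (C_LIGHT * nu)
    return geometric + ionospheric


def corrected_phase(phi1, phi2, nu1, nu2):
    """Combination of Eq. (14.21); the 1/nu plasma term cancels."""
    return phi2 - (nu1 / nu2) * phi1


def delta_tec_estimate(phi1, phi2, nu1, nu2):
    """Differential TEC from phi2 - (nu2/nu1) phi1, in which the geometric term cancels."""
    combo = phi2 - (nu2 / nu1) * phi1
    scale = -2.0 * math.pi * K_ION / C_LIGHT * (1.0 / nu2 - nu2 / nu1**2)
    return combo / scale


nu1, nu2 = 2.3e9, 8.4e9   # assumed observing frequencies, Hz
tau_g = 1.0e-9            # assumed geometric delay, s
delta_tec = 1.0e15        # assumed TEC difference between the two ray paths, m^-2

phi1 = interferometer_phase(nu1, tau_g, delta_tec)
phi2 = interferometer_phase(nu2, tau_g, delta_tec)

print("phi_c with ionosphere   :", corrected_phase(phi1, phi2, nu1, nu2))
print("phi_c without ionosphere:",
      corrected_phase(interferometer_phase(nu1, tau_g, 0.0),
                      interferometer_phase(nu2, tau_g, 0.0), nu1, nu2))
print("recovered delta TEC     :", delta_tec_estimate(phi1, phi2, nu1, nu2))
```

Real data would also require resolving the phase ambiguities at each frequency, which 
this sketch does not attempt. 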
14.1.4 Absorption 


Absorption in the ionosphere is caused by collisions of electrons with ions and 
neutral particles. At frequencies much greater than $\nu_p$, the power absorption 
coefficient is 


$$ \kappa = 2.68\times10^{-7}\,\frac{n_e\nu_c}{\nu^2} \quad (\mathrm{m}^{-1}) , \tag{14.22} $$


where $\nu_c$ is the collision frequency and $n_e$ is in m$^{-3}$. The collision frequency in hertz 
is approximately 


$$ \nu_c \simeq 6.1\times10^{-9}\left(\frac{T}{300}\right)^{-3/2} n_i + 1.8\times10^{-14}\left(\frac{T}{300}\right)^{1/2} n_n , \tag{14.23} $$


where $n_i$ is the ion density and $n_n$ is the neutral particle density, both in m$^{-3}$ 
(Evans and Hagfors 1968). Numerical values of absorption are listed in Table 14.1. 
Radiometric measurements of both electron temperature and opacity have been 
made by Rogers et al. (2015). 


14.1.5 Small- and Large-Scale Irregularities 


The small-scale irregularities in the electron density distribution introduce random 
changes in the wavefront of a passing electromagnetic wave. As a consequence, 
fluctuations in fringe amplitude and phase can be readily observed with an interfer- 
ometer at frequencies below a few hundred megahertz. In the early days of radio 
astronomy, signals from Cygnus A and other compact sources were observed to 
fluctuate on timescales of 0.1—1 min. At first, these fluctuations were thought to 
be intrinsic to the sources (Hey et al. 1946), but later observations with spaced 
receivers showed that the fluctuations were uncorrelated for receiver separations of 
more than a few kilometers (Smith et al. 1950). This result led to the conclusion that 
irregularities in the ionosphere were perturbing the cosmic signals. The predominant 
scale sizes in the ionization irregularities were found to be a few kilometers or 
less. The timescale of the fluctuations indicates that ionospheric wind speeds are 
in the range of 50-300 m s$^{-1}$. The effects of these fluctuations have been studied 
extensively at frequencies between about 20 and 200 MHz and have been observed 
at frequencies as high as 7 GHz (Aarons et al. 1983). An early example of the 
fluctuations seen in interferometer measurements is given in Fig. 14.4. Hewish 
(1952), Booker (1958), and Lawrence et al. (1964) reviewed the early results and 
techniques. A comprehensive review of theory and observations of ionospheric 
fluctuations can be found in Crane (1977), Fejer and Kelley (1980), and Yeh and 
Liu (1982), and summaries of global morphology can be found in Aarons (1982) and 
Fig. 14.4 (left) Typical records of the correlator output on three occasions from a phase-switching 
interferometer at Cambridge, England, having a 1-km baseline and operating at a wavelength of 
8 m. The irregular responses are caused by disturbances in the ionosphere. (right) Probability 
distributions of the angle of arrival deduced from the zero crossings of the correlator response. 
Reprinted with permission of and © the Royal Society, conveyed through Copyright Clearance 
Center Inc. From Hewish (1952). 


Aarons et al. (1999). Measurements with the GPS can be very useful in monitoring 
ionospheric fluctuations [e.g., Ho et al. (1996), Pi et al. (1997)]. The effects of 
ionospheric scintillation on a synthesis telescope have been described by Spoelstra 
and Kelder (1984). Loi et al. (2015a) report extensive measurements of ionospheric- 
induced position wander in sources at 150 MHz, and Loi et al. (2015b) demonstrated 
a parallax technique for determining the height of the perturbing layer. In Sect. 14.2, 
we discuss a theory of scintillation, which can be applied to the ionosphere as well 
as to the interplanetary and interstellar media. 

Large-scale variations in the electron density integrated along the line of 
sight are caused by traveling ionospheric disturbances (TIDs). TIDs, which are 
manifestations of acoustic-gravity waves in the upper atmosphere, are quasi- 
periodic, large-scale perturbations in electron density. The atmosphere has a natural 
buoyancy, so that a parcel of gas displaced vertically and released will oscillate at 
a frequency known as the Brunt–Väisälä, or buoyancy, frequency. This frequency 
is about 0.5-2 mHz (periods of 10—20 min) at ionospheric heights. For waves with 
frequencies above the buoyancy frequency, the restoring force is pressure (acoustic 
wave), and for waves with frequencies below the buoyancy frequency, the restoring 
force is gravity (gravity wave). Hunsucker (1982) and Hocke and Schlegel (1996) 
have reviewed the literature on acoustic-gravity waves. There are many potential 
sources of TIDs, including auroral heating, severe weather fronts, earthquakes, and 
volcanic eruptions. Medium-scale TIDs have scale lengths of 100-200 km and 
timescales of 10-20 min and cause a variation in TEC of 0.5-5%. Such TIDs are 
present for a substantial fraction of the time. Large-scale TIDs, which are relatively 
uncommon, have scale lengths of 1000 km and timescales of hours and can cause 
variations in TEC of up to 8%. One such disturbance, excited by a volcano, was 
observed by VLBI (Roberts et al. 1982). A variety of ionospheric disturbances have 
been studied by observations of compact sources with the VLA [e.g., Helmboldt 
et al. (2012), Helmboldt (2014)]. 


14.2 Scattering Caused by Plasma Irregularities 


Understanding the propagation of radiation in a random medium is an important 
problem in many fields. The signals from cosmic radio sources propagate through 
several ionized random media, including the ionized interstellar gas of our Galaxy, 
the solar wind, and the ionosphere. In the observer’s plane, there are two effects. 
First, the amplitude varies with the position of the observer, which leads to temporal 
amplitude variations if there are relative motions among the source, scattering 
medium, and observer. Second, the image of the source is also distorted in a 
frequency-dependent manner. Much of the research in this area has been motivated 
by the attempt to understand the observational characteristics of pulsars [see, e.g., 
Gupta (2000)]. Propagation effects in the turbulent troposphere are described in 
Chap. 13. 


14.2.1 Gaussian Screen Model 


We begin the discussion by considering a simple model that serves to illustrate many 
features of the problem. This model was first developed by Booker et al. (1950) to 
explain ionospheric scintillation and was refined by Ratcliffe (1956). Scheuer (1968) 
applied it to pulsar observations. The model assumes that the irregular medium is 
confined to a thin screen and that the irregularities (blobs) have one characteristic 
scale size a. Diffraction effects are neglected within the irregular medium; only 
the phase change imposed by the medium is considered. Diffraction is taken into 
account in the free-space region between the irregular medium and the receivers. 
The geometric situation is shown in Fig. 14.5. The thin-screen assumption is not 
particularly restrictive. However, the assumption that the screen is filled with plasma 
blobs having one characteristic size is restrictive and distinguishes this model from 
the power-law model, in which a range of scale sizes is present. From Eqs. (14.5) 
Fig. 14.5 Geometry of a thin-screen scintillation model. An initially plane wave is incident on a 
thin phase-changing screen. The emerging wavefront is irregular. As the wave propagates to the 
observer, amplitude fluctuations develop, as suggested by the crossing rays. Below the antenna is 
a plot of intensity vs. position along the wavefront. If there is motion between the screen and the 
observer, the spatial fluctuations will be observed as temporal fluctuations in the power received or 
the fringe visibility. 


and (14.11), the index of refraction of the plasma can be written 


$$ n \simeq 1 - \frac{r_e n_e\lambda^2}{2\pi} , \tag{14.24} $$


where $r_e$ is the classical electron radius, equal to $e^2/4\pi\epsilon_0 mc^2$ or $2.82\times10^{-15}$ m, 
and the term in $\nu_B$ is neglected. Thus, the excess phase shift (a phase advance in this 
situation) across one blob is 


$$ \Delta\phi_1 = r_e\lambda a\,\Delta n_e , \tag{14.25} $$

where $\Delta n_e$ is the excess electron density in the blob over the ambient level. If the 
thickness of the screen is L, then the wave will encounter about L/a blobs, and the 
rms phase deviation $\Delta\phi$ will be $\Delta\phi_1\sqrt{L/a}$, or 


$$ \Delta\phi = r_e\lambda\,\Delta n_e\sqrt{La} . \tag{14.26} $$
Fig. 14.6 Path of a refracted ray in the thin-screen model. The rms scattering angle, $\theta_s$, is given 
by Eq. (14.27). 


The wave emerging from the screen is crinkled; that is, the amplitude is unchanged, 
but the phase is no longer constant and has random fluctuations with rms deviation 
$\Delta\phi$. The wave can therefore be decomposed into an angular spectrum of waves 
propagating with a variety of angles. The full width of the angular spectrum, $\theta_s$, can 
be estimated by imagining that the random medium consists of refracting wedges 
that tilt the wavefront by the amount $\pm\Delta\phi\lambda/2\pi$ over a distance a. Thus, 


$$ \theta_s = \frac{1}{\pi}\, r_e\lambda^2\,\Delta n_e\sqrt{\frac{L}{a}} . \tag{14.27} $$


If the source is not infinitely distant, then the incident wave will not be plane. In that 
case, the observed scattering angle $\theta_s'$ depends on the location of the screen with 
respect to the source and the observer. Since $\theta_s$ and $\theta_s'$ are small angles, it follows 
from the geometry in Fig. 14.6 that 


$$ \theta_s' = \frac{R'}{R + R'}\,\theta_s , \tag{14.28} $$

where R and R’ are defined in Fig. 14.6. Therefore, the effectiveness of the scattering 
screen is diminished if the screen is moved toward the source. This lever effect 
is very important in astrophysical situations. It can be used to distinguish galactic 
and extragalactic sources whose radiation passes through the same scattering screen 
(Lazio and Cordes 1998). 

Amplitude fluctuations build up as the wave propagates away from the screen. 
If the phase fluctuations are large, $\Delta\phi > 1$, then significant amplitude 
fluctuations occur when rays cross (see Fig. 14.5). The critical distance beyond which 
large-amplitude fluctuations are observed is 


$$ R_f \simeq \frac{a}{\theta_s} . \tag{14.29} $$


Note that if $\Delta\phi = 2\pi$, then $R_f$ is the distance for which the size of a blob is 
equal to the size of the first Fresnel zone. The random electric field distribution 
at the Earth, in the plane perpendicular to the propagation direction, is called the 
diffraction pattern. It has a characteristic correlation length $d_c$, given by 


$$ d_c \simeq \frac{\lambda}{\theta_s'} . \tag{14.30} $$


If the screen moves with relative velocity $v_s$ in the direction perpendicular to the 
propagation direction, so that the diffraction pattern sweeps across the observer, 
then the timescale of variability is 


$$ \tau_d \simeq \frac{d_c}{v_s} \simeq \frac{R + R'}{R'}\,\frac{\lambda}{\theta_s v_s} . \tag{14.31} $$


The signal reaching the observer by traveling along the scattered ray path is delayed 
by an amount 


$$ \tau_c \simeq \frac{RR'\theta_s^2}{2c(R + R')} \tag{14.32} $$


with respect to the unscattered signal. The phase of the scattered wave is $2\pi\nu\tau_c$ with 
respect to the direct (unscattered) wave, and interference between these two waves 
causes scintillation. The bandwidth over which the relative phase changes by $2\pi$ is 
called the correlation bandwidth, $\Delta\nu_c$. The correlation bandwidth is the reciprocal 
of $\tau_c$, and for the case R = R’ is 


$$ \Delta\nu_c \simeq \frac{8c}{R_s\theta_s^2} , \tag{14.33} $$


where $R_s$ is the distance between the source and the observer. If the observations are 
made with a receiver of bandwidth greater than $\Delta\nu_c$, the amplitude fluctuations will 
be greatly reduced. Note from Eqs. (14.33) and (14.27) that $\Delta\nu_c$ varies as $\lambda^{-4}$. 
Finally, if the source has two equal components separated by distance $\ell$, then 
each component will produce the same diffraction pattern, but these patterns will be 
displaced at the Earth by distance $\ell R/R'$. If this distance is greater than $d_c$, then the 
diffraction pattern will be smoothed and the amplitude fluctuations reduced. Thus, 
if the source size is greater than a critical size $\theta_c$, amplitude fluctuations will be 
sharply reduced because the diffraction patterns from the component parts overlap 
and are smoothed out. From Eqs. (14.28) and (14.30), $\theta_c$ can be written as 


$$ \theta_c = \frac{\lambda}{R\theta_s} . \tag{14.34} $$


Hence, only sources of small angular diameter scintillate. In the optical regime, 
the analogous phenomenon is that stars twinkle, but usually planets do not. An 
elegant application of Eq. (14.34) was made by Frail et al. (1997) to determine 
the angular size of the expanding radio source associated with a gamma-ray burst. 
They determined that the amplitude fluctuations in the radio emission, assumed to 
be caused by interstellar scattering, ceased during the first weeks after the burst, 
indicating that the source diameter had increased beyond the critical size of 3 μas 
at that time. 
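
The chain of thin-screen estimates in this section can be collected into a single 
function. The sketch below is only illustrative: it assumes that R is the 
screen-to-observer distance and R' the source-to-screen distance, writes the scattering 
delay as RR'theta_s^2 / [2c(R + R')] (the geometric path excess of the scattered ray), and 
uses invented screen parameters loosely resembling an interstellar case. All names and 
numerical values are hypothetical. 

```python
import math

R_E = 2.82e-15          # classical electron radius, m
C_LIGHT = 299792458.0   # speed of light, m/s


def thin_screen(wavelength, delta_ne, blob, thickness, r_obs, r_src, v_s):
    """Thin-screen scintillation estimates in SI units.

    r_obs is R (screen to observer) and r_src is R' (source to screen);
    this assignment of R and R' is an assumption of the sketch.
    """
    dphi = R_E * wavelength * delta_ne * math.sqrt(thickness * blob)  # rms phase
    theta_s = dphi * wavelength / (math.pi * blob)       # scattering angle at the screen
    theta_obs = theta_s * r_src / (r_obs + r_src)        # observed angle (lever effect)
    d_c = wavelength / theta_obs                         # pattern correlation length
    tau_d = d_c / v_s                                    # variability timescale
    tau_c = r_obs * r_src * theta_s**2 / (2.0 * C_LIGHT * (r_obs + r_src))
    return {
        "delta_phi": dphi,
        "theta_s": theta_s,
        "theta_obs": theta_obs,
        "d_c": d_c,
        "tau_d": tau_d,
        "tau_c": tau_c,
        "delta_nu_c": 1.0 / tau_c,                       # correlation bandwidth
        "theta_c": wavelength / (r_obs * theta_s),       # critical source size
    }


params = thin_screen(
    wavelength=1.0,     # m
    delta_ne=100.0,     # m^-3, assumed excess density in a blob
    blob=1.0e9,         # m, assumed blob size a
    thickness=3.0e18,   # m, assumed screen thickness L
    r_obs=1.5e19,       # m, assumed R
    r_src=1.5e19,       # m, assumed R'
    v_s=5.0e4,          # m/s, assumed transverse screen velocity
)
for name, value in params.items():
    print(f"{name:10s} = {value:.3e}")
```

Moving the screen toward the source (smaller `r_src` at fixed total distance) reduces 
`theta_obs`, which is the lever effect described above. 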

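A useful quantity is the ensemble average fringe visibility, $\langle\mathcal{V}_m\rangle$, measured by an 
interferometer in the presence of scintillation. Assume that the phases $\phi_1$ and $\phi_2$ at 
two points along the phase screen, separated by distance d, are random variables 
with a joint Gaussian distribution with variance $\Delta\phi^2$ and normalized correlation 
$\rho(d)$. $\rho(d)$ is the correlation function of the phase or of the variable component of 
the index of refraction. The joint probability density function of the phase along the 
wavefront is 


$$ p(\phi_1, \phi_2) = \frac{1}{2\pi\Delta\phi^2\sqrt{1 - \rho(d)^2}}\exp\left\{-\frac{\phi_1^2 + \phi_2^2 - 2\rho(d)\phi_1\phi_2}{2\Delta\phi^2[1 - \rho(d)^2]}\right\} , \tag{14.35} $$


where $\rho(d) = \langle\phi_1\phi_2\rangle/\Delta\phi^2$, the correlation function of the phase fluctuations. The 
expectation of $e^{j(\phi_1 - \phi_2)}$ is 


$$ \langle e^{j(\phi_1-\phi_2)}\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{j(\phi_1-\phi_2)}\, p(\phi_1,\phi_2)\, d\phi_1\, d\phi_2 , \tag{14.36} $$

which can be evaluated directly from Eq. (14.35) with the result 

$$ \langle e^{j(\phi_1-\phi_2)}\rangle = e^{-\Delta\phi^2[1-\rho(d)]} . \tag{14.37} $$

For a point source of flux density S, the ensemble average of the fringe visibility is 

$$ \langle\mathcal{V}_m\rangle = S\,\langle e^{j(\phi_1-\phi_2)}\rangle , \tag{14.38} $$

or 

$$ \langle\mathcal{V}_m\rangle = S\, e^{-\Delta\phi^2[1-\rho(d)]} . \tag{14.39} $$

If the source has an intrinsic visibility $\mathcal{V}_0$, the ensemble average is 


$$ \langle\mathcal{V}_m\rangle = \mathcal{V}_0\, e^{-\Delta\phi^2[1-\rho(d)]} . \tag{14.40} $$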
This result was first derived by Ratcliffe (1956) and Mercier (1962). Note that the 
structure function of phase is $D_\phi(d) = 2\Delta\phi^2[1 - \rho(d)]$, so that Eq. (14.40) is 
equivalent to Eq. (13.80). In much of the early radio astronomical literature, $\rho(d)$ 
is assumed to be a Gaussian function 


$$ \rho(d) = e^{-d^2/2a^2} , \tag{14.41} $$


where the characteristic scale length a corresponds to the blob size in the discussion 
above. This model, called the Gaussian screen model, is probably unrealistically 
restrictive because there are undoubtedly many scale sizes present. In the case in 
which $\Delta\phi \gg 1$, $\langle\mathcal{V}_m\rangle$ decreases rapidly as d increases, and we need consider only 
the situation of $d \ll a$. Then, substitution of Eq. (14.41) into Eq. (14.40) yields 


$$ \langle\mathcal{V}_m\rangle \simeq \mathcal{V}_0\, e^{-\Delta\phi^2 d^2/2a^2} . \tag{14.42} $$


Thus, the intensity distribution of a point source observed through a Gaussian screen 
is a Gaussian distribution with a diameter (full width at half-maximum) of 


$$ \theta_s = \frac{\sqrt{2\ln 2}}{\pi}\,\frac{\Delta\phi\,\lambda}{a} = \frac{\sqrt{2\ln 2}}{\pi}\, r_e\lambda^2\,\Delta n_e\sqrt{\frac{L}{a}} . \tag{14.43} $$


This formula for $\theta_s$ is essentially equivalent to the one given in Eq. (14.27). In the 
case in which $\Delta\phi \ll 1$, the normalized visibility function drops from unity to $e^{-\Delta\phi^2}$ 
when $d \gg a$. Therefore, the resulting intensity distribution for a point source is an 
unresolved core surrounded by a halo. The ratio of the flux density in the halo to the 
flux density in the core is $e^{\Delta\phi^2} - 1$. 
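
The closed-form expectation in Eq. (14.37) can be checked with a small Monte Carlo 
experiment. The sketch below draws pairs of zero-mean Gaussian phases with a chosen rms 
and correlation, averages the real part of $e^{j(\phi_1-\phi_2)}$, and compares the result 
with the analytic expression. The sample size, the seed, and the values of the rms phase 
and correlation are arbitrary choices made for illustration. 

```python
import math
import random


def simulated_phase_factor(delta_phi, rho, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <exp[j(phi1 - phi2)]> for jointly Gaussian phases."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        phi1 = delta_phi * g1
        # Mix g1 and g2 so that phi2 has rms delta_phi and correlation rho with phi1.
        phi2 = delta_phi * (rho * g1 + math.sqrt(1.0 - rho * rho) * g2)
        total += math.cos(phi1 - phi2)   # the imaginary part averages to zero
    return total / n_samples


def predicted_phase_factor(delta_phi, rho):
    """Analytic result of Eq. (14.37)."""
    return math.exp(-delta_phi**2 * (1.0 - rho))


delta_phi, rho = 1.0, 0.5
print("simulated:", simulated_phase_factor(delta_phi, rho))
print("predicted:", predicted_phase_factor(delta_phi, rho))
```

With a Gaussian correlation function, `rho` would be obtained from the baseline d via 
Eq. (14.41) before calling either function. 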


14.2.2 Power-Law Model 


The spectrum of fluctuations in the electron density in ionized astrophysical plasmas 
is normally modeled as a power law, 


$$ P_{n_e}(q) = C_{n_e}^2\, q^{-\alpha} , \tag{14.44} $$


where q is the three-dimensional spatial frequency (cycles per meter), $q^2 = q_x^2 + 
q_y^2 + q_z^2$, and $C_{n_e}^2$ characterizes the strength of the turbulence. The definition of $C_{n_e}^2$ 
varies in the literature, depending on whether it is used as a constant in the spectrum 
or in the structure function. The two-dimensional power spectrum of phase [see 
Eq. (14.26) for the relation between $\Delta\phi$ and $\Delta n_e$] is 


$$ P_\phi(q) = 2\pi r_e^2\lambda^2 L\, P_{n_e}(q) . \tag{14.45} $$
Hence, from Eq. (13.104), the structure function of phase is 

$$ D_\phi(d) = 8\pi^2 r_e^2\lambda^2 L\int_0^\infty [1 - J_0(qd)]\, P_{n_e}(q)\, q\, dq . \tag{14.46} $$


For a power-law spectrum of the form of Eq. (14.44), the structure function is 


$$ D_\phi(d) = 8\pi^2 r_e^2\lambda^2 C_{n_e}^2 L f(\alpha)\, d^{\alpha-2} , \tag{14.47} $$

where $f(\alpha)$ is of order unity. The index $\alpha$ is often taken to be 11/3, which is its value 
for Kolmogorov turbulence, for which $f(\alpha) = 1.45$ [see Cordes et al. (1986) for 
other values of $f(\alpha)$]. The ensemble average of the interferometric visibility [see 
Eq. (13.80)] is 


$$ \langle\mathcal{V}\rangle = \mathcal{V}_0\, e^{-D_\phi/2} , \tag{14.48} $$

or 

$$ \langle\mathcal{V}\rangle = \mathcal{V}_0\, e^{-4\pi^2 r_e^2\lambda^2 C_{n_e}^2 L f(\alpha) d^{\alpha-2}} . \tag{14.49} $$


The observed intensity distribution, the Fourier transform of Eq. (14.49), differs 
slightly from a Gaussian distribution, as can be seen in Fig. 13.1 1b. The scattering 
angle (full width at half-maximum) obtained from the width of the intensity 
distribution is 


6, = 4.1 x 1078 (C2, )35A!/> (arcsec) , (14.50) 


where $\lambda$ is in units of meters and $C_{ne}^2 L$ is in m$^{-17/3}$. Thus, a difference between the
power-law model and the Gaussian screen model is that $\theta_s$, measured by Fourier
transformation of visibility data over a range of baselines, is proportional to $\lambda^{11/5}$ in
the former model and to $\lambda^2$ in the latter. Note that if $\langle \mathcal{V} \rangle$ were measured on a single
baseline, that is, with $d$ fixed, and if $\theta_s$ were estimated from comparison of the
measured visibility with the visibility expected for a Gaussian intensity distribution,
$\theta_s$ would appear to vary as $\lambda^2$ in both models.

Measurements of visibility must be made over sufficiently long integration times 
to achieve an ensemble average if Eqs. (14.48), (14.49), and (14.50) are to be valid 
(Cohen and Cronyn 1974). A detailed discussion of the averaging time necessary to 
achieve an ensemble average is given by Narayan (1992) (see also Sect. 14.4.3). 

For plasmas, we can expect that the power law will hold from an inner spatial frequency $q_0$
to an outer spatial frequency $q_1$; that is, there are no fluctuations on length scales smaller than
$\ell_{\rm inner} = 1/q_1$ or larger than $\ell_{\rm outer} = 1/q_0$. For the case in which $qd \ll 1$, that
is, when the baseline is shorter than the inner length scale, the Bessel function in
Eq. (14.46) becomes $1 - q^2 d^2/4$, and the integration is straightforward, yielding


$$D_\phi(d) = 2\pi^2 r_e^2 \lambda^2 L\, C_{ne}^2\, d^2 \left[\frac{q_1^{4-\alpha} - q_0^{4-\alpha}}{4-\alpha}\right] . \tag{14.51}$$

This is a very important result that has two interesting consequences. First, the
structure function varies as $d^2$ regardless of $\alpha$. Second, for $\alpha < 4$, the structure
function is dominated by the effect of the smallest irregularities, whereas for $\alpha > 4$,
it is dominated by the effect of the largest-scale irregularities. This result also
suggests an important demarcation in phenomena between plasmas with $\alpha < 4$ and
those with $\alpha > 4$. The case in which $\alpha < 4$ is called Type A (shallow spectrum),
and the case in which $\alpha > 4$ is called Type B (steep spectrum) (Narayan 1988).

Consider the situation in which the spectrum has three regimes:


$$P_{ne}(q) = \begin{cases} C_{ne}^2\, q_0^{-\alpha} , & q < q_0 \\ C_{ne}^2\, q^{-\alpha} , & q_0 < q < q_1 \\ 0 , & q > q_1 . \end{cases} \tag{14.52}$$


Substitution of Eq. (14.52) into Eq. (14.46) gives


$$D_\phi(d) \simeq \begin{cases} c_1 d^2 , & d < 1/q_1 = \ell_{\rm inner} \\ (d/d_0)^{\alpha-2} , & 1/q_1 < d < 1/q_0 \\ c_2 , & d > 1/q_0 = \ell_{\rm outer} , \end{cases} \tag{14.53}$$


where $c_1$ and $c_2$ are constants, and we have introduced the normalization factor $d_0$,
such that $D_\phi(d_0) = 1$, as in the discussion of the troposphere in Sect. 13.1.7. We
have also assumed that $1/q_1 < d_0 < 1/q_0$. The constants needed to join the power-law
segments are $c_1 = q_1^{4-\alpha} d_0^{2-\alpha}$ and $c_2 = (q_0 d_0)^{2-\alpha}$. This spectrum and structure
function for the model are shown in Fig. 14.7.
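The joining constants can be checked by evaluating the piecewise structure function on either side of each break. A short Python sketch follows; the function name and argument order are illustrative choices, and the constants are taken as $c_1 = q_1^{4-\alpha} d_0^{2-\alpha}$ and $c_2 = (q_0 d_0)^{2-\alpha}$, which is what continuity at $d = 1/q_1$ and $d = 1/q_0$ requires.

```python
def structure_function(d, q0, q1, d0, alpha):
    """Piecewise phase structure function of Eq. (14.53).

    Assumes 1/q1 < d0 < 1/q0, so that D_phi(d0) = 1 falls on the
    power-law segment.
    """
    c1 = q1 ** (4.0 - alpha) * d0 ** (2.0 - alpha)  # joins d^2 segment at d = 1/q1
    c2 = (q0 * d0) ** (2.0 - alpha)                 # saturation level beyond d = 1/q0
    if d < 1.0 / q1:
        return c1 * d * d
    if d < 1.0 / q0:
        return (d / d0) ** (alpha - 2.0)
    return c2
```

Evaluating the function just below and just above each break gives the same value, so the three segments join continuously for any $\alpha$.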


14.3 Interplanetary Medium 


14.3.1 Refraction 


Radio waves passing near the Sun are bent by the ionization of the solar corona and
the solar wind. The general characteristics of the solar corona and the solar wind can
be found in Winterhalter et al. (1996). Calculation of the refraction in the extended
solar atmosphere is important for the understanding of solar radio emission at low
frequencies, where the bending angles are large (Kundu 1965), and for tests of the
general relativistic bending of electromagnetic radiation passing near the Sun (see
Sect. 12.6).

Fig. 14.7 (a) A model spectrum of the electron density fluctuations with inner and outer scales
of spatial frequency $q_0$ and $q_1$. (b) The corresponding structure function of phase: see Eqs. (14.52)
and (14.53). Note that $\ell_{\rm inner} = 2\pi/q_1$ and $\ell_{\rm outer} = 2\pi/q_0$. From Moran (1989), © Kluwer Academic
Publishers. With kind permission from Springer Science and Business Media.

The electron density as a function of distance from the Sun can be measured in 
a variety of ways. Optical observations of Thomson scattering during solar eclipses 
have been analyzed to give an electron density model 


$$n_e = (1.55\, r^{-6} + 2.99\, r^{-16}) \times 10^{14} \quad (\text{m}^{-3}) , \tag{14.54}$$


where $r$, the radial distance from the Sun in units of the solar radius, is less than ~4.
Equation (14.54) is the well-known Allen-Baumbach formula (Allen 1947).

The electron density profile over a broad range of radii can be determined from
satellites that can track the plasma frequency measured during solar radio bursts. For
example, observations with the Wind spacecraft at frequencies from 14 MHz down to
a few kHz could be reasonably represented by the model


$$n_e = 3.3 \times 10^{11}\, r^{-2} + 4.1 \times 10^{12}\, r^{-4} + 8.0 \times 10^{13}\, r^{-6} \quad (\text{m}^{-3}) \tag{14.55}$$


for $1.2 < r < 215$ (Leblanc et al. 1998). The value of $n_e$ at $r = 215$ (1 AU) is $7.2 \times
10^6$ m$^{-3}$. This model is based on data taken near sunspot minimum, and the range
of conditions is shown in Fig. 14.8. Ground-based measurements of radio sources
during solar occultations (e.g., scintillations of the Crab Nebula) (Erickson 1964;
Evans and Hagfors 1968) and dispersion measurements of pulsars (Counselman and
Rankin 1972; Counselman et al. 1974) give about the same result as Eq. (14.55) for
$r > 10$.
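The two density models, Eqs. (14.54) and (14.55), are simple to evaluate. The Python sketch below is illustrative only: the function names are arbitrary, and the coefficients are taken as $(1.55\,r^{-6} + 2.99\,r^{-16}) \times 10^{14}$ m$^{-3}$ for the Allen-Baumbach formula and $3.3 \times 10^{11} r^{-2} + 4.1 \times 10^{12} r^{-4} + 8.0 \times 10^{13} r^{-6}$ m$^{-3}$ for the Leblanc et al. (1998) model.

```python
def ne_allen_baumbach(r):
    """Eq. (14.54): electron density in m^-3; r in solar radii (intended for r < ~4)."""
    return (1.55 * r ** -6 + 2.99 * r ** -16) * 1e14


def ne_leblanc(r):
    """Eq. (14.55): electron density in m^-3; r in solar radii (1.2 < r < 215)."""
    return 3.3e11 * r ** -2 + 4.1e12 * r ** -4 + 8.0e13 * r ** -6
```

At $r = 215$ the second function is dominated by the $r^{-2}$ term and returns a value close to the 1 AU density quoted in the text.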

Fig. 14.8 The electron density vs. radial distance from the Sun, measured by the Wind satellite
(orbiting at 1 AU) from observations of solar radio bursts. Scatter in data derived from observations
at 11 epochs indicates the range of conditions in the solar wind. For other data and in situ
measurements, see Bougeret et al. (1984). From Leblanc et al. (1998), © Solar Phys. With kind
permission from Springer Science and Business Media.

Fig. 14.9 Path of a ray passing through the ionized gas surrounding the Sun. p is the impact
parameter, and a is the solar elongation angle, that is, the angle between the Sun and the source in
the absence of solar bending.

The angle of refraction of a ray passing near the Sun can be calculated readily
for the case in which this angle is small. A ray obeys Snell’s law in spherical
coordinates, $nr \sin z$ = constant (Smart 1977), where $n$ is the index of refraction
and $z$ is the angle between the ray and a line from the center of the Sun, as shown in
Fig. 14.9. From this relation, the bending angle is found to be


$$\theta_b = \pi - 2 \int_{r_m}^{\infty} \frac{dr}{r \sqrt{(nr/p)^2 - 1}} , \tag{14.56}$$


where $r_m$ is the distance of closest approach of the ray to the Sun, and $p$ is the impact
parameter (see Fig. 14.9). Assume that the electron density has a single power-law
distribution given by


$$n_e = n_{e0}\, r^{-\beta} , \tag{14.57}$$


where $n_{e0}$ is the electron density in m$^{-3}$ at one solar radius, and $\beta$ is a constant. For a
fully ionized solar wind, characterized by a constant mass loss rate and velocity, $\beta$ is
equal to 2. This case is applicable for $r \gtrsim 10$ (see Fig. 14.8). The index of refraction
is obtained by substituting Eqs. (14.57) and (14.5) into Eq. (14.11) and neglecting
the term in $\nu_B$. Graphical solutions of Eq. (14.56) for large bending angles are given
by Jaeger and Westfold (1950). For small bending angles, an approximate solution
to Eq. (14.56) can be obtained by the use of the substitution $nr/p = \sec\theta$,


$$\theta_b \simeq 80.6 \sqrt{\pi}\; \frac{\Gamma\!\left(\frac{\beta+1}{2}\right)}{\Gamma\!\left(\frac{\beta}{2}\right)}\, \frac{n_{e0}}{\nu^2}\, p^{-\beta} , \tag{14.58}$$


where $p$ is in units of the solar radius, and $\Gamma$ is the gamma function. Note that the
rays are bent away from the Sun. The bending angle associated with the model in
Eq. (14.55) (using only the quadratic term) is


$$\theta_b \simeq 2.4\, \lambda^2 p^{-2} \quad \text{(arcmin)} , \tag{14.59}$$


where $\lambda$ is the wavelength in meters. For a multiple power-law model of electron
density such as given in Eqs. (14.54) and (14.55), the bending angles for each
component can be summed when the bending angles are small.

There is another pedagogically interesting way to determine the bending angle
from the change in excess propagation path with impact parameter. The excess
(phase) path for a ray passing through the corona, for the case in which the effect of
ray bending can be neglected, is, from Eq. (14.19),


$$\mathcal{L} \simeq -\frac{40.3}{\nu^2} \int_{-\infty}^{\infty} n_e\, dy , \tag{14.60}$$


where $y$ is measured along the ray path as shown in Fig. 14.9. For a power-law model
given by Eq. (14.57), the excess path is


$$\mathcal{L} \simeq -\frac{40.3\, n_{e0}}{\nu^2} \int_{-\infty}^{\infty} \frac{dy}{(p^2 + y^2)^{\beta/2}} , \tag{14.61}$$


which can be integrated to give


$$\mathcal{L} \simeq -\frac{40.3 \sqrt{\pi}}{\nu^2}\, \frac{\Gamma\!\left(\frac{\beta-1}{2}\right)}{\Gamma\!\left(\frac{\beta}{2}\right)}\, n_{e0}\, p^{1-\beta} . \tag{14.62}$$

The change in $\mathcal{L}$ with $p$ describes the tilting of the wavefront and is the bending
angle; hence $\theta_b \simeq d\mathcal{L}/dp$ (Bracewell et al. 1969). Differentiation of Eq. (14.62)
with respect to $p$ gives Eq. (14.58).
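The statement that differentiating the excess path reproduces the bending angle can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' calculation: it takes the excess path as $\mathcal{L} = -(40.3\sqrt{\pi}/\nu^2)\,[\Gamma((\beta-1)/2)/\Gamma(\beta/2)]\, n_{e0}\, p^{1-\beta}$ with $p$ and $y$ both in solar radii, and compares a finite-difference derivative with the closed form $80.6\sqrt{\pi}\,[\Gamma((\beta+1)/2)/\Gamma(\beta/2)]\, n_{e0}\, \nu^{-2} p^{-\beta}$. All function names are hypothetical.

```python
import math


def excess_path(p, beta, ne0, freq_hz):
    """Excess phase path for n_e = ne0 * r**-beta along a straight ray.

    p is the impact parameter in solar radii; the result is in the same
    length unit as p (an assumption of this sketch).
    """
    coeff = math.sqrt(math.pi) * math.gamma((beta - 1.0) / 2.0) / math.gamma(beta / 2.0)
    return -40.3 * ne0 / freq_hz ** 2 * coeff * p ** (1.0 - beta)


def bending_angle(p, beta, ne0, freq_hz):
    """Small-angle bending (rad) from the closed form with the 80.6 sqrt(pi) prefactor."""
    ratio = math.gamma((beta + 1.0) / 2.0) / math.gamma(beta / 2.0)
    return 80.6 * math.sqrt(math.pi) * ratio * ne0 / (freq_hz ** 2 * p ** beta)


def bending_from_path(p, beta, ne0, freq_hz, step=1e-5):
    """Central-difference estimate of d(excess path)/dp."""
    upper = excess_path(p + step, beta, ne0, freq_hz)
    lower = excess_path(p - step, beta, ne0, freq_hz)
    return (upper - lower) / (2.0 * step)
```

The two routes agree because $(\beta - 1)\,\Gamma((\beta-1)/2) = 2\,\Gamma((\beta+1)/2)$, which turns the factor 40.3 into 80.6.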

We mention the effect of the general relativistic bending of waves passing close
to the Sun here because the phenomenon can be described classically by an effective
index of refraction given by $1 + 2GM_\odot/rc^2$, where $G$ is the gravitational constant,
and $M_\odot$ is the mass of the Sun. The bending angle, for small values of $p$, is
(Weinberg 1972)


$$\theta_{\rm GR} \simeq -1.75\, p^{-1} \quad \text{(arcsec)} . \tag{14.63}$$


The negative sign indicates that the bending is toward the Sun, which is opposite to the
sense of bending by the interplanetary medium. Measurements of the solar general
relativistic bending are discussed in more detail in Sect. 12.6.
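The coefficient in Eq. (14.63) follows from the standard deflection $4GM_\odot/(c^2 b)$ for a ray with impact distance $b$. A brief Python check is given below; the constant values are nominal ones supplied for this sketch (they are not quoted in the text), and the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # solar gravitational parameter (m^3 s^-2), nominal value
C_LIGHT = 299792458.0       # speed of light (m s^-1)
R_SUN = 6.957e8             # nominal solar radius (m)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def gr_bending_arcsec(p):
    """General relativistic bending in arcsec for impact parameter p (solar radii).

    Returned with a negative sign to follow the convention of Eq. (14.63),
    in which bending toward the Sun is negative.
    """
    return -4.0 * GM_SUN / (C_LIGHT ** 2 * p * R_SUN) * ARCSEC_PER_RAD
```

For a grazing ray, `gr_bending_arcsec(1.0)` evaluates to about -1.75 arcsec, and the magnitude falls off as $1/p$.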


14.3.2 Interplanetary Scintillation (IPS) 


Scintillation of extragalactic radio sources due to irregularities in the solar wind 
was first observed by Clarke (1964) and reported by Hewish et al. (1964). Clarke 
was studying 88 of the 3C sources at 178 MHz with the Cambridge one-mile 
interferometer. She noticed that three of the sources, the only ones smaller than 
2”, showed anomalous rapid (< 1 s) scintillation, which could not be attributed to 
the ionosphere. They were all within 30° in angle from the Sun. Interplanetary 
scintillation is readily distinguishable from ionospheric scintillation, since the 
timescale [Eq. (14.31)] and critical source size [Eq. (14.34)] are approximately 1 s 
and 0.5” for interplanetary scintillation and 30 s and 10 arcmin for ionospheric 
scintillation. Further observations of interplanetary scintillation by Cohen et al. 
(1967a) showed that the angular size of the radio source 3C273B is smaller 
than 0.02”, based on the application of Eq. (14.34). This result, and the long- 
baseline interferometric results, stimulated the development of VLBI. Interplanetary 
scintillation can be studied with the modern generation of low-frequency arrays 
[e.g., Kaplan et al. (2015)]. 

A comprehensive discussion of the interpretation of interplanetary scintillation 
can be found in Salpeter (1967), Young (1971), and Scott et al. (1983). For 
rough calculations, the scattering angle due to the interplanetary medium may be 
approximated by (Erickson 1964) 


$$\theta_s \simeq 50 \left(\frac{\lambda}{p}\right)^2 \quad \text{(arcmin)} , \tag{14.64}$$


where $\lambda$ is in meters, and $p$, the impact parameter, is in solar radii. This relationship
is based on measurements taken in 1960-61 at 11-m wavelength for impact 
parameters between 5 and 50 solar radii. Analysis of VLBI observations at 3.6
and 6 cm obtained in 1991 for a range of impact parameters of 10-50 solar
radii led to a model for $C_{ne}^2$ of the form $C_{ne}^2 = 1.5 \times 10^{14}\,(r/R_{\rm sun})^{-3.7}$ (Spangler
and Sakurai 1995). Note that the power-law exponent is expected to be about $-4$
from the elementary consideration that $C_{ne}^2$ is proportional to the variance of the
electron density, which is proportional to the square of the density. For a constant
wind speed, the density is proportional to $r^{-2}$, and hence $C_{ne}^2$ is proportional to
$r^{-4}$. Deviations from $-4$ are caused by the radial dependence of the magnetic field
strength, which plays a role in driving the turbulence. Integrating $C_{ne}^2$ along the line
of sight, and using Eq. (14.50), we derive an estimate for the scattering angle of
6, = 3100(p/A)~?? arcsec, which is comparable to the result in Eq. (14.64). 

The concept that extended sources do not scintillate as much as point sources [see
Eq. (14.34)] can be generalized to obtain more information about source structure.
We assume that the scintillation is caused by a screen at a distance $R$ from the Earth,
as shown in Fig. 14.6, where $R < R_s$, and that the intensity at the Earth is $I(x, y)$,
where $x$ and $y$ are coordinates in a plane parallel to the screen in Fig. 14.5. The
function $\Delta I(x, y)$ is equal to $I(x, y) - \langle I(x, y) \rangle$, where $\langle I(x, y) \rangle$ is the mean intensity.
It has a power spectrum $S_{I0}(q_x, q_y)$ for a point source and $S_I(q_x, q_y)$ for an extended
source, where $q_x$ and $q_y$ are the spatial frequencies (cycles per meter). If the visibility
of the source is $\mathcal{V}(q_x R, q_y R)$, then it can be shown (Cohen 1969) that


$$S_I(q_x, q_y) = S_{I0}(q_x, q_y)\, |\mathcal{V}(q_x R, q_y R)|^2 , \tag{14.65}$$


where $q_x R$ and $q_y R$ correspond to the projected baseline coordinates $u$ and $v$. The
scintillation index of the source $m_s$ is defined by


$$m_s^2 = \frac{\langle \Delta I(x, y)^2 \rangle}{\langle I(x, y) \rangle^2} = \frac{1}{\langle I(x, y) \rangle^2} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} S_I(q_x, q_y)\, dq_x\, dq_y . \tag{14.66}$$


In principle, $S_I(q_x, q_y)$ could be computed from the simultaneous measurements
of $\Delta I(x, y)$ with a large number of spaced receivers. In practice, the motion of the
solar wind sweeps the diffraction pattern across a single telescope so that, from
measurements of $\Delta I(t)$, the temporal power spectrum $S(f)$ can be calculated. If the
diffraction pattern moves with velocity $v_s$ in the $x$ direction, then $S(f)$ can be related
to the spatial spectrum since $q_x = f/v_s$:


$$S(f) = \frac{1}{v_s} \int_{-\infty}^{\infty} S_I\!\left(q_x = \frac{f}{v_s},\, q_y\right) dq_y . \tag{14.67}$$


In principle, $|\mathcal{V}|^2$ can be recovered from Eq. (14.65) by observing a source over a
range of different orientations with respect to the solar wind vector. The situation 
is entirely analogous to that of lunar occultation observations (Sect. 17.2) except 
that with lunar occultation observations, the visibility phase can also be obtained. 
An estimate of the source diameter can be deduced from the width of the temporal 
power spectrum (Cohen et al. 1967b) or from the scintillation index [Eq. (14.66)] 
(Little and Hewish 1966).

Table 14.2 Typical values of the effects of the interstellar medium on radiation at 100 MHz

Effect                       Equation number   Magnitude^a   Frequency dependence^b
Angular broadening^c         14.43             0.3″          $\nu^{-2}$
Pulse broadening^c           14.32             $10^{-4}$ s   $\nu^{-4}$
Scintillation bandwidth^c    14.33             $10^{4}$ Hz   $\nu^{4}$
Spectral broadening^c        -                 1 Hz          $\nu^{-1}$
Scintillation timescale^c    14.31             10 s          $\nu$
Scintillation timescale^d    -                 10° s         $\nu^{-2}$
Free-free optical depth      14.22             0.01          $\nu^{-2}$
Faraday rotation             14.71             10 rad        $\nu^{-2}$

Adapted from Cordes (2000).
^a For a source in the Galactic plane at a distance of 1 kpc. Actual values can differ by an order of
magnitude.
^b Valid for the Gaussian screen model or the power-law turbulence model when $D_\phi(d) \sim d^2$ [see
Eq. (14.46)].
^c Diffractive scattering.
^d Refractive scattering (see Sect. 14.4.3).


Interplanetary scattering is generally weak, except in directions close to the
Sun. An interesting phenomenon is that the scintillation index, $m_s$, increases
monotonically with decreasing impact parameter, reaching $m_s \simeq 1$ for small
diameter sources around $p \sim 0.1$ and then decreasing for smaller values of $p$ [e.g.,
Armstrong and Coles (1978), Gapper et al. (1982), Manoharan et al. (1995)]. The
effects of refractive scattering (discussed in the next section and in Sect. 15.3),
which can be important in the strong scattering regime, have been studied by
Narayan et al. (1989).

A substantial effort has been made to study the 3-D characteristics of the 
interplanetary medium by monitoring the scintillation of radio sources over the 
past decades. See Manoharan (2012) for results from the Ooty Radio Telescope and 
Asai et al. (1998) and Tokumaru et al. (2012) for results from the Solar-Terrestrial 
Environment Laboratory of Nagoya University. Long-term trends are discussed by 
Janardhan et al. (2015). 


14.4 Interstellar Medium 


Table 14.2 lists the typical magnitudes and scale sizes of various effects caused by
the interstellar medium. These are discussed individually in the following sections.¹


¹In this section, we follow the commonly used symbols DM, RM, and SM for dispersion measure,
rotation measure, and scattering measure.


14.4.1 Dispersion and Faraday Rotation


The smooth, ionized component of the interstellar medium of our Galaxy affects 
propagation by introducing delay and Faraday rotation. The time of arrival of a 
pulse of radiation, such as that from a pulsar, is 
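$$t_p = \int_0^L \frac{dy}{v_g} , \tag{14.68}$$


where $L$ is the propagation path, $v_g = cn$ is the group velocity, and $n$ is given by
Eq. (14.11), where we neglect the effect of the magnetic field. Differentiation of
Eq. (14.68) with respect to frequency gives


$$\frac{dt_p}{d\nu} = -\frac{e^2}{4\pi^2 \epsilon_0 m c\, \nu^3} \int_0^L n_e\, dy . \tag{14.69}$$


The integral of $n_e$ over the path length is called the dispersion measure,

$$\mathrm{DM} = \int_0^L n_e\, dy , \tag{14.70}$$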


which is the same quantity as the total electron content. The derivative $dt_p/d\nu$ can be estimated
by measuring the time of arrival of pulsar pulses at different frequencies, and the
dispersion measure can then be found from Eq. (14.69). If the distance to the pulsar
is known, then the average electron density can be calculated. A typical value of
$\langle n_e \rangle$ in the plane of our Galaxy is 0.03 cm$^{-3}$ (Weisberg et al. 1980). Alternatively,
if a pulsar’s distance is unknown, it can be estimated from Eq. (14.69) using an
estimated average value of $n_e$.
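The dispersive delay implied by Eqs. (14.68) to (14.70) can be sketched numerically. The code below is a hedged illustration: it assumes the cold-plasma index without magnetic field, $n \simeq 1 - \nu_p^2/2\nu^2$, so that the delay relative to infinite frequency is $e^2\,\mathrm{DM}/(8\pi^2 \epsilon_0 m c\, \nu^2)$; the physical constants are CODATA values supplied for the example, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge (C)
EPS0 = 8.8541878128e-12       # vacuum permittivity (F m^-1)
M_E = 9.1093837015e-31        # electron mass (kg)
C_LIGHT = 299792458.0         # speed of light (m s^-1)
PARSEC = 3.0856775814913673e16  # meters per parsec


def dispersion_delay(dm_pc_cm3, freq_hz):
    """Delay (s) relative to infinite frequency for a given DM in pc cm^-3."""
    dm_si = dm_pc_cm3 * PARSEC * 1.0e6  # convert pc cm^-3 to m^-2
    k = E_CHARGE ** 2 / (8.0 * math.pi ** 2 * EPS0 * M_E * C_LIGHT)
    return k * dm_si / freq_hz ** 2


def delay_between(dm_pc_cm3, freq_low_hz, freq_high_hz):
    """Arrival-time difference (s) between a lower and a higher frequency."""
    return dispersion_delay(dm_pc_cm3, freq_low_hz) - dispersion_delay(dm_pc_cm3, freq_high_hz)


def mean_electron_density(dm_pc_cm3, distance_pc):
    """Average n_e (cm^-3) along the path, DM divided by distance."""
    return dm_pc_cm3 / distance_pc
```

For DM = 1 pc cm$^{-3}$ at 1 GHz the delay is about 4.15 ms. A pulsar at 1 kpc with DM = 30 pc cm$^{-3}$ gives the typical mean density of 0.03 cm$^{-3}$ quoted above.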

The magnetic field of the Galaxy causes Faraday rotation of the polarization 
plane of radiation from extragalactic radio sources. Equation (14.12) can be 
rewritten 


$$\Delta\psi = \lambda^2\, \mathrm{RM} , \tag{14.71}$$


where RM is the rotation measure given by

$$\mathrm{RM} = 8.1 \times 10^5 \int_0^L n_e B_\parallel\, dy . \tag{14.72}$$


Here, RM is in radians per square meter, $\lambda$ is in meters, $B_\parallel$ is the longitudinal
component of magnetic field in gauss (1 gauss = $10^{-4}$ tesla), $n_e$ is in cm$^{-3}$, and
$dy$ is in parsecs (pc) (1 pc = $3.1 \times 10^{16}$ m). The interstellar magnetic field can
be estimated by dividing the rotation measure by the dispersion measure. Typical
values of the magnetic field obtained in this way are 2 μG (Heiles 1976). This
procedure underestimates the magnetic field if the field reverses direction along
the line of sight. A formula for roughly estimating the rotation measure due to the
galactic magnetic field is (Spitzer 1978)


$$\mathrm{RM} \simeq -18\, |\cot b| \cos(\ell - 94^\circ) , \tag{14.73}$$


where $\ell$ and $b$ are the galactic longitude and latitude. Extensive measurements of
rotation measure as a function of direction can be found in Oppermann et al. (2012).
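These relations are easy to apply numerically. The Python sketch below is illustrative: the function names are arbitrary, RM is taken in rad m$^{-2}$, and the field estimate simply divides RM by $8.1 \times 10^5$ times DM, which gives the electron-density-weighted mean of the longitudinal field in gauss.

```python
import math


def galactic_rm(l_deg, b_deg):
    """Rough Galactic rotation measure, -18 |cot b| cos(l - 94 deg), from Eq. (14.73)."""
    cot_b = 1.0 / math.tan(math.radians(b_deg))
    return -18.0 * abs(cot_b) * math.cos(math.radians(l_deg - 94.0))


def rotation_angle(rm, wavelength_m):
    """Faraday rotation of the polarization plane (rad), Eq. (14.71)."""
    return rm * wavelength_m ** 2


def mean_b_parallel_gauss(rm, dm_pc_cm3):
    """Density-weighted mean longitudinal field (gauss) from RM and DM."""
    return rm / (8.1e5 * dm_pc_cm3)
```

As an example, RM = 48.6 rad m$^{-2}$ with DM = 30 pc cm$^{-3}$ corresponds to a mean longitudinal field of 2 μG.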

Faraday rotation that occurs within a radio source depolarizes the emergent 
radiation. This depolarization happens because radiation emitted from different 
depths in the source suffers different amounts of Faraday rotation. Such a source 
might be a relativistic gas emitting polarized synchrotron radiation immersed in a 
thermal plasma that causes the Faraday rotation. The degree of polarization of the 
observed radiation can be succinctly described in a Fourier transform relationship 
when self-absorption is negligible. We first introduce the function $M$, the complex
degree of linear polarization, defined by


$$M = m_\ell\, e^{j2\psi} = \frac{Q + jU}{I} , \tag{14.74}$$

where $m_\ell$ is the degree of linear polarization, $\psi$ is the position angle of the electric
field, and $Q$, $U$, and $I$ are the Stokes parameters as defined in Sect. 4.7. If $y$ is the
linear distance into the source, $\psi(y)$ is the intrinsic position angle of the radiation
at depth $y$, $j_\nu(y)$ is the volume emissivity of the source, and $\lambda^2\beta(y)$ is the Faraday
rotation suffered by radiation emitted at depth $y$, then the degree of polarization of
the observed radiation can be written


$$M(\lambda^2) = \frac{\int_0^\infty m_\ell(y)\, j_\nu(y)\, e^{j2[\psi(y) + \lambda^2\beta(y)]}\, dy}{\int_0^\infty j_\nu(y)\, dy} . \tag{14.75}$$


The denominator in Eq. (14.75) is the total intensity. $\beta(y)$ is the Faraday depth,
which increases monotonically into the source as long as the sign of the longitudinal
magnetic field direction does not change. In any case, we can superpose all the
radiation from the same Faraday depth and write the integrals in Eq. (14.75) as a
function of $\beta$ instead of $y$, yielding


$$M(\lambda^2) = \int_{-\infty}^{\infty} F(\beta)\, e^{j2\lambda^2\beta}\, d\beta , \tag{14.76}$$
where


$$F(\beta) = \frac{m_\ell(\beta)\, j_\nu(\beta)\, e^{j2\psi(\beta)}}{\int_{-\infty}^{\infty} j_\nu(\beta)\, d\beta} . \tag{14.77}$$

Thus, $M(\lambda^2)$ and $F(\beta)$ form a Fourier transform pair. $F(\beta)$ is sometimes called
the Faraday dispersion function. Unfortunately, $F(\beta)$, in general, cannot be found
since $M$ cannot be measured for negative values of $\lambda^2$. Because of this difficulty
with the Fourier transform, $F(\beta)$ is usually estimated by model fitting. However, if
$\psi(y)$ is constant, then $M(-\lambda^2) = M^*(\lambda^2)$, and $F(\beta)$ can be obtained by Fourier
transformation.

Consider the result for a simple source model for which $m_\ell$, $\psi$, and $j_\nu$ are
constant. From Eq. (14.76), we have


$$M(\lambda^2) = M(0) \left[\frac{\sin \lambda^2 \mathrm{RM}}{\lambda^2 \mathrm{RM}}\right] e^{j\lambda^2 \mathrm{RM}} , \tag{14.78}$$


where RM is the Faraday rotation measure through the whole source. If the
Faraday rotation originates in front of the radiation source, the complex degree of
polarization is


$$M(\lambda^2) = M(0)\, e^{j2\lambda^2 \mathrm{RM}} . \tag{14.79}$$


In this case, there is no depolarization, and the Faraday rotation is twice that of
Eq. (14.78), in which the source is uniformly distributed throughout the rotation
medium. For detailed treatment of intrinsic Faraday rotation, see Burn (1966),
Gardner and Whiteoak (1966), and Brentjens and de Bruyn (2005).
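The uniform-slab result can be verified by integrating the Fourier relation of Eq. (14.76) directly. The sketch below assumes a Faraday dispersion function that is constant for Faraday depths between 0 and RM and zero elsewhere, normalized so that $M(0) = 1$, and compares a midpoint-rule integral with the closed form $[\sin(\lambda^2 \mathrm{RM})/(\lambda^2 \mathrm{RM})]\, e^{j\lambda^2 \mathrm{RM}}$. The function names and the number of integration steps are choices made for this example.

```python
import cmath
import math


def slab_numeric(lambda_sq, rm, steps=20000):
    """Midpoint-rule integral of exp(j 2 lambda^2 beta) over 0..rm, divided by rm."""
    dbeta = rm / steps
    total = 0j
    for k in range(steps):
        beta = (k + 0.5) * dbeta
        total += cmath.exp(2j * lambda_sq * beta) * dbeta
    return total / rm


def slab_closed_form(lambda_sq, rm):
    """Closed form for the uniform slab, M / M(0)."""
    x = lambda_sq * rm
    return math.sin(x) / x * cmath.exp(1j * x)


def screen_closed_form(lambda_sq, rm):
    """Foreground screen, M / M(0): pure rotation, no depolarization."""
    return cmath.exp(2j * lambda_sq * rm)
```

The modulus of `slab_closed_form` falls to zero when $\lambda^2 \mathrm{RM} = \pi$, whereas `screen_closed_form` always has unit modulus and twice the phase.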


14.4.2 Diffractive Scattering 


Diffractive interstellar scattering has been extensively investigated by observation 
of pulsars and compact extragalactic radio sources. For pulsars, the temporal 
broadening of the pulses [Eq. (14.32)], the decorrelation bandwidth [Eq. (14.33)], 
and the angular broadening [Eq. (14.27)] can be measured. Interpretation of the 
measurements in terms of a thin-screen model suggests that $\Delta n_e/n_e \simeq 10^{-3}$ and
that the scale size responsible for the scintillation is on the order of $10^{11}$ cm. The
temporal variations or scintillation of the signal from a pulsar are caused by the 
motions of the observer and the pulsar relative to the quasi-stationary interstellar 
medium. A measurement of the decorrelation bandwidth can be used to estimate 
the scattering angle [Eq. (14.33)]. This estimate of the scattering angle and the
measurement of the timescale of fading ($10^2$-$10^3$ s at 408 MHz) can be used to
estimate the relative velocity of the scattering screen by Eq. (14.31). From the 
relative velocity of the screen, the transverse velocity of the pulsar can be found. 
Velocities, and thus proper motions, of pulsars estimated in this way (Lyne and 
Smith 1982) agree with those measured directly with interferometers [see, e.g., 
Campbell et al. (1996)]. The transverse component of the orbital velocity of a binary 
pulsar has also been measured (Lyne 1984).

Observations show that the fluctuations in electron density can be described by a
power-law spectrum with a power-law exponent of about 3.7 ± 0.3, which is similar
to the value of 11/3 for Kolmogorov turbulence (Rickett 1990; Cordes et al. 1986).
The power-law spectrum appears to extend over a range of scale sizes from less 
than 10!° cm to more than 10!° cm. The inner scale may be set by the proton 
gyrofrequency (~10/ cm) and the outer scale by the scale height of the Galaxy 
(~107 cm). Observational evidence for the inner scale is given by Spangler and 
Gwinn (1990). 

Extensive measurements of the angular sizes of extragalactic radio sources have
been used to derive an approximate formula for $\theta_s$ [see Eq. (14.27)] based on the
Gaussian screen model, by Harris et al. (1970), Readhead and Hewish (1972),
Cohen and Cronyn (1974), Duffett-Smith and Readhead (1976), and others. This
formula is


$$\theta_s \simeq \frac{15}{\sqrt{|\sin b|}}\, \lambda^2 \quad \text{(mas)} , \quad |b| > 15^\circ , \tag{14.80}$$


where $b$ is the Galactic latitude and $\lambda$ is the wavelength in meters. The pulsar data
have been interpreted by Cordes (1984) in terms of the power-law model to arrive
at approximate formulas for $\theta_s$:


$$\theta_s \simeq \begin{cases} 7.5\, \lambda^{11/5} \ \text{(arcsec)} , & |b| < 0.6^\circ \\ 0.5\, |\sin b|^{-3/5} \lambda^{11/5} \ \text{(arcsec)} , & 0.6^\circ < |b| < 3^\circ\text{-}5^\circ \\ 13\, |\sin b|^{-3/5} \lambda^{11/5} \ \text{(mas)} , & |b| > 3^\circ\text{-}5^\circ . \end{cases} \tag{14.81}$$
The accuracy of the representations in Eq. (14.81) decreases with decreasing $|b|$. In
particular, the scattering angle at low latitudes, $|b| < 1^\circ$, can take on a wide range
of values (Cordes et al. 1984). A much more detailed model with 23 parameters
characterizing the electron distribution in the Galaxy was constructed by Taylor and
Cordes (1993). This model has been superseded by another model called NE2001
(Cordes and Lazio 2002, 2003). They define a scattering measure to characterize the
strength of turbulence given by


$$\mathrm{SM} = \int_0^L C_{ne}^2\, dy , \tag{14.82}$$


where $C_{ne}^2$ is defined in Eq. (14.44). With this definition, the angular broadening of
an extragalactic radio source is given by


$$\theta_s \simeq 71\, \nu^{-11/5}\, \mathrm{SM}^{3/5} \quad \text{(mas)} , \tag{14.83}$$


where $\nu$ is in GHz. There are several regions in the Galaxy of anomalously high
scattering (Cordes and Lazio 2001). The most highly scattered source among them
is a quasar along the line of sight to a galactic HII region known as NGC6334B,
which has an angular size of 3″ at 1.5 GHz (Trotter et al. 1998). The apparent sizes
of interstellar masers, which are mostly found in the Galaxy at low galactic latitudes,
are sometimes set by interstellar scattering (Gwinn et al. 1988).

Fig. 14.10 A clear example of interstellar scattering demonstrated by the observed angular size
of the compact source in the center of our Galaxy (Sgr A*). The measurements were made with
interferometric arrays (Jodrell Bank at the longest two wavelengths, the Event Horizon Telescope at
the shortest wavelength, and the VLBA at the intermediate wavelengths). In all cases, the visibility
or image data were fitted with Gaussian profiles to determine the major axis (full width at
half-maximum). Error bars not visible are smaller than the symbol size. The line is an approximate fit to
the data at wavelengths longer than 6 cm and has the form $\lambda^2$. The $\lambda$-squared dependence suggests
that if the scattering is caused by a turbulent medium following the Kolmogorov prescription, it has
an inner scale that is longer than the size of the measurement arrays [see Eqs. (14.51) and (14.53)].
Angular sizes for this plot were taken from Davies et al. (1976), Bower et al. (2004, 2006), Shen
et al. (2005), and Doeleman et al. (2008). At 0.13 cm, the intrinsic source size exceeds the scattering
size. Interstellar scattering was first identified as an image broadening agent in Sgr A* by Davies
et al. (1976).

An example of a compact radio source that suffers a high degree of interstellar
scattering is Sagittarius A* at the dynamical center of our Galaxy. This source has an
angular size of about 1.0″ at a wavelength of 30 cm (1.5 GHz) [compared with 0.5″
predicted by Eq. (14.81)]. The angular size varies approximately as the wavelength
squared over the entire measuring range ~0.3-30 cm, as shown in Fig. 14.10. The
measurements by Doeleman et al. (2008) show that the intrinsic source size exceeds
the scattering size at 1.3 mm. If the scattering can be modeled accurately, as in
the case of Sgr A*, then its effects on the image can in principle be removed. The
observed visibility $\mathcal{V}_m$ is the true visibility times $\mathcal{V}_s = e^{-D_\phi/2}$ [see Eq. (14.48)].
If, for example, $D_\phi = a\lambda^2 d^2$, appropriate if the baseline is less than the inner scale
of turbulence, then $\mathcal{V}_s$ is a simple Gaussian function, and the true visibility can be
recovered as

$$\mathcal{V} = \mathcal{V}_m / \mathcal{V}_s = \mathcal{V}_m\, e^{a\lambda^2 d^2/2} . \tag{14.84}$$


The success of this inversion clearly depends on the signal-to-noise ratio. Further
discussion of “deblurring” techniques can be found in Fish et al. (2014).


14.4.3 Refractive Scattering 


The realization by Sieber (1982) that the characteristic periods of amplitude
scintillations of pulsars, on timescales of days to months, were correlated with
their dispersion measures led Rickett et al. (1984) to the identification of another
important scale length in the turbulent interstellar medium, the refractive scale $d_{\rm ref}$.
Refractive scattering is important in the strong scattering regime ($d_0 < d_{\rm Fresnel} =
\sqrt{\lambda R}$), where $d_0$ is the diffractive scale size defined by $D_\phi(d_0) = 1$. The refractive
scale is the size of the diffractive scattering disk, which is the projection of the
cone of scattered radiation on the scattering screen, located a distance $R$ from the
observer. The diameter of the diffractive scattering disk is $R\theta_s$. The scattering disk
represents the maximum extent on the screen from which radiation can reach the
observer. With a power-law distribution of irregularities, it is the irregularities at the
maximum allowed scale that have the largest amplitude and are the most influential.
Thus, the refractive scale is $d_{\rm ref} \simeq R\theta_s$. Since $\theta_s \simeq \lambda/d_0$, we can write


$$d_{\rm ref} = \frac{\lambda R}{d_0} \tag{14.85}$$

or

$$d_{\rm ref} = \frac{d_{\rm Fresnel}^2}{d_0} . \tag{14.86}$$

The scale lengths $d_{\rm ref}$ and $d_0$ are widely separated. Hence, the timescale associated
with refractive scattering for a screen velocity of $v_s$, $t_{\rm ref} = d_{\rm ref}/v_s$, is much
longer than that associated with diffractive scattering, $t_{\rm dif} = d_0/v_s$. Suppose that
a source is observed through a scattering screen located at a distance of 1 kpc, at
$b \simeq 20^\circ$, and a wavelength of 0.5 m. For this case, the diffractive scale length is
$2 \times 10^9$ cm, the Fresnel scale is $4 \times 10^{11}$ cm, and the refractive scale is $8 \times 10^{13}$ cm.
The typical velocity associated with the interstellar medium is 50 km s$^{-1}$ (the sum
of the Earth’s orbital motion and the motion of the Sun with respect to the local
standard of rest; see Table A10.1). For this velocity, the diffractive and refractive
timescales for amplitude scintillation are 6 min and 6 months, respectively. Sgr A*,
in addition to its diffractive scattering, also shows the effect of refractive scattering
in its visibility function (see Fig. 14.11).

Fig. 14.11 The effect of refractive interstellar scattering on Sgr A* seen on a plot of fringe
visibility (correlated flux density) vs. projected baseline at 23.8 GHz. The array consisted of the
VLBA augmented with the phased VLA and the GBT. Note the logarithmic flux density scale. The
projected baseline has been calculated so as to remove the source elongation. Errors are 1σ values.
The solid line through the data shows the visibility model for a Gaussian diffraction-scattered disk
with 735-mas diameter (see Fig. 14.10), while the dashed lines above 100 Mλ show the percentage
of time the expected visibility for the refractive scattering component should be below the levels
indicated for 97, 75, 50, 25, and 3% of the time, respectively. The inset shows a simulated image
of Sgr A*, which shows the refractive substructure ($t_{\rm ref} > t_{\rm int} > t_{\rm dif}$) calculated from the algorithm
described by Johnson and Gwinn (2015) and smoothed to 0.3 mas. Data from Gwinn et al. (2014).
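The numbers in the worked example for a screen at 1 kpc observed at 0.5 m follow directly from Eqs. (14.85) and (14.86). The Python sketch below reproduces them, taking the diffractive scale of $2 \times 10^9$ cm and the velocity of 50 km s$^{-1}$ from the text as inputs; the function name and the dictionary keys are illustrative.

```python
import math

KPC = 3.0857e19  # meters per kiloparsec


def scattering_scales(wavelength_m, distance_m, d0_m, v_s):
    """Return Fresnel and refractive scales (m) and both timescales (s)."""
    d_fresnel = math.sqrt(wavelength_m * distance_m)
    d_ref = wavelength_m * distance_m / d0_m  # same as d_fresnel**2 / d0_m
    return {
        "d_fresnel": d_fresnel,
        "d_ref": d_ref,
        "t_dif": d0_m / v_s,
        "t_ref": d_ref / v_s,
    }


scales = scattering_scales(wavelength_m=0.5, distance_m=1.0 * KPC, d0_m=2.0e7, v_s=5.0e4)
```

With these inputs the Fresnel scale is about $3.9 \times 10^9$ m, the refractive scale about $7.7 \times 10^{11}$ m, the diffractive timescale 400 s, and the refractive timescale about $1.5 \times 10^7$ s, consistent with the rounded values of 6 min and 6 months.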


Refractive scattering is thought to be responsible for the slow amplitude variations
observed in some pulsars and quasars at meter and decimeter wavelengths.
This realization solved the long-standing problem of understanding the behavior of 
“long-wavelength variables,” which could not be explained by intrinsic variability 
models based on synchrotron emission. The identification of two scales in the 
interstellar medium provides strong support for the power-law model. The two 
scales provide a way of estimating the power-law index, because the relative 
importance of refractive scattering increases as the power spectrum steepens. It is 
interesting to note that these two scales arise from a power-law phenomenon, which 
has no intrinsic scale. The scales are related to the propagation and depend on the 
wavelength and distance of the screen. 

In addition to amplitude scintillation, refractive scattering causes the apparent
position of the source to wander with time. The amplitude and timescale of this
wander are about equal to $\theta_s$ and $t_{\rm ref}$, respectively. The character of the
wander depends on the power-law index of the fluctuations. Limits on the power-law
index have been established from the limits on the amplitude of image wander in the
relative positions among clusters of masers (Gwinn et al. 1988).

Rare sudden changes in the intensity of several extragalactic sources, called 
Fiedler events, or extreme scattering events (Fiedler et al. 1987), are probably 
caused by refractive scattering in the interstellar medium. In the archetypal example,
the flux density of the extragalactic source 0954+658 increased by 30% and then 
dropped by 50% over a period of a month, after which it recovered in symmetric 
fashion. A large-scale plasma cloud presumably drifted between the source and the 
Earth, creating flux density changes due to focusing and refraction. 

Because there are two timescales associated with strong scattering in the interstellar
medium, three distinct data-averaging regimes are important for constructing
images from interferometry data obtained on a timescale $t_{\rm int}$. These are
$t_{\rm int} > t_{\rm ref}$ (ensemble-average image), $t_{\rm ref} > t_{\rm int} > t_{\rm dif}$ (average image), and
$t_{\rm int} < t_{\rm dif}$ (snapshot image). The characteristics of these image regimes are described by Narayan (1992),
Narayan and Goodman (1989), and Goodman and Narayan (1989). For ensemble 
averaging [see Eqs. (14.48) through (14.50)], the image is essentially convolved with 
the appropriate “seeing” function. An example of an “average” image is shown for 
the simulation of Sgr A* in Fig. 14.11. For more analysis and simulations of images 
in various time regimes, see Johnson and Gwinn (2015). The snapshot regime offers 
intriguing possibilities for image restoration. In this regime, it should be possible 
to image the source with a resolution of $\lambda/d_{\rm ref}$, which can be very much smaller
than that achievable with terrestrial interferometry. In this case, the scattering screen 
functions as the aperture of the interferometer. Because of the multipath propagation 
provided by refractive scattering, which brings radiation from widely separated parts 
of the scattering screen to the observer, the effective baselines can be very large. See 
Sect. 15.3 for further discussion, including an observation by Wolszczan and Cordes 
(1987). 
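The three averaging regimes amount to a comparison of the integration time with the two scattering timescales. The following Python sketch only illustrates that comparison; the function name `averaging_regime` and the example timescales are assumptions chosen for illustration and do not come from the text.

```python
def averaging_regime(t_int, t_dif, t_ref):
    """Classify the imaging regime for an integration time t_int.

    t_dif and t_ref are the diffractive and refractive timescales in the
    same units as t_int, with t_dif < t_ref for strong scattering.
    """
    if not 0 < t_dif < t_ref:
        raise ValueError("expected 0 < t_dif < t_ref")
    if t_int > t_ref:
        return "ensemble average"
    if t_int > t_dif:
        return "average"
    return "snapshot"


# Hypothetical timescales in seconds, chosen only for illustration.
for t_int in (10.0, 3600.0, 1.0e7):
    print(t_int, averaging_regime(t_int, t_dif=100.0, t_ref=1.0e5))
```

Integration times that fall exactly on a boundary are assigned here to the shorter-time regime, which is an arbitrary choice of the sketch.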


Appendix 14.1 Refractive Bending in the Ionosphere 


In this appendix, we show that a ray incident on an ionospheric layer will be bent
so that it has a smaller zenith angle upon arrival at the Earth’s surface, as shown in
Fig. 14.2. Application of the law of sines to the two triangles with opening angles
$\theta_1$ and $\theta_2$ gives

$$ \frac{\sin z_0}{r_0 + h_i} = \frac{\sin z_1}{r_0} \tag{A14.1} $$

and

$$ \frac{\sin z_{1r}}{r_0 + h_i + \Delta h} = \frac{\sin z_2}{r_0 + h_i} . \tag{A14.2} $$

Snell’s law gives the relations

$$ n \sin z_{1r} = \sin z_1 \tag{A14.3} $$
Fig. A14.1 The elevation angle of the radio horizon vs. frequency for various values of electron
density in a uniform layer from 200 to 400 km, which approximates the F layer. The densities
from the bottom to the top curves are (5, 3, 2, 1, and 0.5) $\times 10^{12}$ m$^{-3}$, corresponding to plasma
frequencies of 20.1, 15.6, 12.7, 9.0, and 6.4 MHz. The knees of the curves (shown only for
the higher densities) occur at $\nu \simeq 4\nu_p$ [see Eq. (A14.12)]. For frequencies above the knees,
the radio horizon is given by Eq. (A14.8) for z = 90°. Below the knee, the radio horizon is
limited by internal reflection and is given by Eq. (A14.11). From Vedantham et al. (2014). © Royal
Astronomical Society, used with permission.


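and

$$ \sin z_{2r} = n \sin z_2 . \tag{A14.4} $$

Combining Eqs. (A14.1)-(A14.4) gives

$$
\begin{aligned}
\sin z_{2r} &= n \sin z_2 \\
&= \frac{r_0 + h_i}{r_0 + h_i + \Delta h}\, n \sin z_{1r} \\
&= \frac{r_0 + h_i}{r_0 + h_i + \Delta h}\, \sin z_1 \\
&= \frac{r_0}{r_0 + h_i + \Delta h}\, \sin z_0 .
\end{aligned}
\tag{A14.5}
$$

Note that $z_0$ is related to $z_{2r}$ without reference to $n$. Since $z = z_{2r} + \theta_1 + \theta_2$, the net
bending angle is

$$ \Delta z = z - z_0 = z_{2r} + \theta_1 + \theta_2 - z_0 . \tag{A14.6} $$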
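Since $\theta_2 = z_{1r} - z_2$,

$$ \theta_2 = \sin^{-1}\left[\frac{1}{n}\,\frac{r_0}{r_0 + h_i}\,\sin z_0\right] - \sin^{-1}\left[\frac{1}{n}\,\frac{r_0}{r_0 + h_i + \Delta h}\,\sin z_0\right] . \tag{A14.7} $$

Since $\theta_1 = z_0 - z_1$,

$$ \theta_1 = z_0 - \sin^{-1}\left[\frac{r_0}{r_0 + h_i}\,\sin z_0\right] . \tag{A14.8} $$

The final result for $\Delta z$ in terms of $\sin z_0$ is

$$
\begin{aligned}
\Delta z = {} & \sin^{-1}\left[\frac{r_0}{r_0 + h_i + \Delta h}\,\sin z_0\right] - \sin^{-1}\left[\frac{r_0}{r_0 + h_i}\,\sin z_0\right] \\
& + \sin^{-1}\left[\frac{1}{n}\,\frac{r_0}{r_0 + h_i}\,\sin z_0\right] - \sin^{-1}\left[\frac{1}{n}\,\frac{r_0}{r_0 + h_i + \Delta h}\,\sin z_0\right] .
\end{aligned}
\tag{A14.9}
$$

Note that

$$ \Delta z = (z_{2r} - z_2) + (z_{1r} - z_1) . \tag{A14.10} $$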


As an example, let us assume that $h_i = 300$ km, $\Delta h = 200$ km, $r_0 = 6370$ km,
$n_e = 3 \times 10^{11}$ m$^{-3}$, and $\nu = 50$ MHz. From Eqs. (14.4) and (14.5), we find that
$\nu_p = 4.9$ MHz and $n = 0.9951$. For $z_0 = 75^\circ$, the other angles are $z_1 = 67.29^\circ$,
$z_{1r} = 67.98^\circ$, $z_2 = 64.16^\circ$, $z_{2r} = 63.59^\circ$, $\theta_1 = 7.71^\circ$, $\theta_2 = 3.81^\circ$, $z = 75.11^\circ$,
and $\Delta z = 0.11^\circ$. Equation (14.15) gives the same result. This demonstrates the
counterintuitive result that the sign of the change in zenith angle is the same for the
ionosphere and for the troposphere.

For the case $z_0 = 90^\circ$, the result is $z = 90.22^\circ$, so that the radiation from $0.22^\circ$
below the horizon can in principle be received.
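The chain of angles in this example can be checked numerically. The sketch below is a minimal illustration that applies the law-of-sines and Snell's-law relations in sequence for a single uniform layer. The function names are hypothetical, and the plasma frequency is taken from the common approximation $\nu_p \simeq 8.98\sqrt{n_e}$ Hz (with $n_e$ in m$^{-3}$) together with $n^2 = 1 - (\nu_p/\nu)^2$, both of which are assumptions standing in for Eqs. (14.4) and (14.5).

```python
import math

R0_KM = 6370.0  # Earth radius used in the worked example


def refractive_index(n_e, freq_hz):
    """Plasma refractive index, assuming nu_p ~ 8.98*sqrt(n_e) Hz (n_e in m^-3)."""
    nu_p = 8.98 * math.sqrt(n_e)
    return math.sqrt(1.0 - (nu_p / freq_hz) ** 2)


def bending(z0_deg, n, h_i=300.0, dh=200.0, r0=R0_KM):
    """Angles (in degrees) for a ray crossing one uniform layer of index n."""
    z0 = math.radians(z0_deg)
    z1 = math.asin(r0 / (r0 + h_i) * math.sin(z0))                  # law of sines, below layer
    z1r = math.asin(math.sin(z1) / n)                               # Snell's law, bottom of layer
    z2 = math.asin((r0 + h_i) / (r0 + h_i + dh) * math.sin(z1r))    # law of sines, inside layer
    z2r = math.asin(n * math.sin(z2))                               # Snell's law, top of layer
    theta1 = z0 - z1
    theta2 = z1r - z2
    dz = z2r + theta1 + theta2 - z0                                 # net bending angle
    names = ("z1", "z1r", "z2", "z2r", "theta1", "theta2", "dz")
    values = (z1, z1r, z2, z2r, theta1, theta2, dz)
    return {k: math.degrees(v) for k, v in zip(names, values)}


n = refractive_index(3e11, 50e6)
for z0_deg in (75.0, 90.0):
    angles = bending(z0_deg, n)
    print(z0_deg, round(n, 4), {k: round(v, 2) for k, v in angles.items()})
```

With these inputs the sketch should agree with the quoted angles to within a few hundredths of a degree. Small differences can arise from the rounding of $\nu_p$ and $n$.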

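The phenomenon of internal reflection will occur when $z_{1r} = 90^\circ$. This gives
a critical zenith angle, $z_c$, beyond which (i.e., at lower elevation) incoming rays
will not reach the observer, given by

$$ \sin z_c = \frac{r_0 + h_i}{r_0}\, n . \tag{A14.11} $$

For $z_c = 90^\circ$, the frequency at which this effect limits the incoming zenith angle is

$$ \nu \simeq \nu_p \sqrt{\frac{r_0}{2 h_i}} \simeq 4\nu_p , \tag{A14.12} $$

where the numerical value applies for $h_i = 200$ km, as in Fig. A14.1.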

The combination of the normal refraction and the critical angle defines the radio 
horizon. An example of the radio horizon is shown in Fig. A14.1. The radio horizon 
may affect studies of the Epoch of Reionization, as described by Vedantham et al.
(2014). 
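The critical angle and the knee frequency can be evaluated with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the refractive index is assumed to follow $n^2 = 1 - (\nu_p/\nu)^2$, and the layer base is set to 200 km to match the layer described in the caption of Fig. A14.1.

```python
import math


def critical_zenith_angle_deg(n, h_i, r0=6370.0):
    """Critical zenith angle from sin(z_c) = n*(r0 + h_i)/r0; None if there is no cutoff."""
    s = (r0 + h_i) / r0 * n
    return math.degrees(math.asin(s)) if s < 1.0 else None


def knee_frequency_ratio(h_i, r0=6370.0, approximate=False):
    """Ratio nu/nu_p at which z_c reaches 90 degrees, i.e., n = r0/(r0 + h_i)."""
    if approximate:
        return math.sqrt(r0 / (2.0 * h_i))      # small h_i/r0 approximation
    n = r0 / (r0 + h_i)
    return 1.0 / math.sqrt(1.0 - n * n)         # from n^2 = 1 - (nu_p/nu)^2


print(knee_frequency_ratio(200.0), knee_frequency_ratio(200.0, approximate=True))
print(critical_zenith_angle_deg(0.9, 200.0))
```

Both forms of the ratio come out close to 4 for a 200-km layer base. The approximate form drops terms of order $(h_i/r_0)^2$.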
Chapter 15
Van Cittert-Zernike Theorem, Spatial 
Coherence, and Scattering 


This chapter is concerned with the van Cittert-Zernike theorem, including an 
examination of the assumptions involved in its derivation, the requirement of spatial 
incoherence of a source, and the interferometer response to a coherent source. 
Some optical terminology is used, for example, mutual coherence, which includes 
complex visibility. There is also a brief discussion of some aspects of scattering by 
irregularities in the propagation medium. Much of the development of the theory 
of coherence and similar concepts of electromagnetic radiation is to be found in 
the literature of optics. The terminology is sometimes different from that which has 
evolved in radio interferometry, but many of the physical situations are similar or 
identical. However, in spite of the similarity, the literature shows that in the early 
development of radio astronomy, the optical experience was hardly ever mentioned, 
an exception being the reference by Bracewell (1958) to Zernike (1938) for the 
concept of the complex degree of coherence. The van Cittert-Zernike theorem 
contains a simple formalism that includes the basic principles of correlation in 
electromagnetic fields. 


15.1 Van Cittert-Zernike Theorem


We showed in Chaps. 2 and 3 that the cross-correlation of the signals received in 
spaced antennas can be used to form an image of the intensity distribution of a 
distant cosmic source through a Fourier transform relationship. This result is a form 
of the van Cittert-Zernike theorem, which originated in optics. The basis for the 
theorem is a study published by van Cittert in 1934 and followed a few years later 
by a simpler derivation by Zernike. A description of the result established by van 
Cittert and Zernike is given by Born and Wolf (1999, Chap. 10). The original form 
of the result does not specifically refer to the Fourier transform relationship between 
intensity and mutual coherence but is essentially as follows. 
Fig. 15.1 (a) Geometry of a distant spatially incoherent source and the points $P_1$ and $P_2$ at which
the mutual coherence of the radiation is measured. The source plane $(X, Y)$ is parallel to the
measurement plane $(x, y)$ but at a large distance from it. (b) Similar geometry for measurement
of the radiation field from an aperture in the $(X, Y)$ plane that is illuminated from above by a
coherent wavefront. The radiated field has a maximum at the point $P_2$. Direction cosines $(l, m)$ are
defined with respect to the $(x, y)$ axes in the measurement plane, and direction cosines $(l', m')$ are
defined with respect to the $(X, Y)$ axes in the plane of the aperture.


Consider an extended, quasi-monochromatic, incoherent source, and let the
mutual coherence of the radiation be measured at two points $P_1$ and $P_2$ in a plane
normal to the direction of the source, as in Fig. 15.1a. Then suppose that the source is
replaced by an aperture of identical shape and size and illuminated from behind by a
spatially coherent wavefront. The distribution of the electric field amplitude over the
aperture is proportional to the intensity distribution over the source. The Fraunhofer
diffraction pattern of the aperture is observable in the plane containing $P_1$ and $P_2$.
The relative positions of the points $P_1$ and $P_2$ are the same in the two cases, but
for the aperture, the geometric configuration is such that $P_2$ lies on the maximum
of the diffraction pattern. Then the mutual coherence measured for the incoherent
source, normalized to unity for zero spacing between $P_1$ and $P_2$, is equal to the
complex amplitude of the field of the aperture diffraction pattern at the position $P_1$,
normalized to the maximum value at $P_2$.

In this form, the theorem results from the fact that the behavior of both the 
mutual coherence and the Fraunhofer diffraction can be represented by similar 
Fourier transform relationships. Derivation of the theorem provides an opportunity 
to examine the assumptions involved and is given below. The analysis is similar to
that given by Born and Wolf but with some modifications to take advantage of the
simplified geometry when the source is at an astronomical distance. First, we note
that in optics, the mutual coherence function for a field $E(t)$, measured at points 1
and 2, is represented by

$$ \Gamma_{12}(u, v, \tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} E_1(t)\, E_2^*(t - \tau)\, dt , \tag{15.1} $$

where $u$ and $v$ are the coordinates of the spacing between the two measurement
points, expressed in units of wavelength. $\Gamma_{12}(u, v, 0)$, for zero time offset, is
equivalent to the complex visibility $\mathcal{V}(u, v)$ used in the radio case.


15.1.1 Mutual Coherence of an Incoherent Source 


The geometric situation for the incoherent source is shown in Fig. 15.1a. Consider
the source located in a distant plane, indicated by $(X, Y)$. The radiated field is
measured at two points, $P_1$ and $P_2$, in the $(x, y)$ plane that is parallel to the source
plane. In the radio case, these points are the locations of the interferometer antennas.
It is convenient to specify the position of a point in the $(X, Y)$ plane by the direction
cosines $(l, m)$ measured with respect to the $(x, y)$ axes. The source is sufficiently
distant that the direction of any point within it measured from $P_1$ is the same as that
measured from $P_2$. The fields at $P_1$ and $P_2$ resulting from a single element of the
source at the point $(l, m)$ are given by

$$ E_1(l, m, t) = \mathcal{E}\left(l, m, t - \frac{R_1}{c}\right) \frac{\exp[-j2\pi\nu(t - R_1/c)]}{R_1} \tag{15.2} $$

and

$$ E_2(l, m, t) = \mathcal{E}\left(l, m, t - \frac{R_2}{c}\right) \frac{\exp[-j2\pi\nu(t - R_2/c)]}{R_2} , \tag{15.3} $$

where $\mathcal{E}(l, m, t)$ is a phasor representation of the complex amplitude of the electric
field at the source for an element at position $(l, m)$. $R_1$ and $R_2$ are the distances from
this element to $P_1$ and $P_2$, respectively, and $c$ is the velocity of light. The exponential
terms in Eqs. (15.2) and (15.3) represent the phase change in traversing the paths
from the source to $P_1$ and $P_2$.
The complex cross-correlation of the field voltages at $P_1$ and $P_2$ due to the
radiation from the element at $(l, m)$ is, for zero time offset,

$$
\begin{aligned}
\langle E_1(l, m, t)\, E_2^*(l, m, t) \rangle
&= \left\langle \mathcal{E}\left(l, m, t - \frac{R_1}{c}\right) \mathcal{E}^*\left(l, m, t - \frac{R_2}{c}\right) \right\rangle
\frac{\exp[-j2\pi\nu(t - R_1/c)]\, \exp[j2\pi\nu(t - R_2/c)]}{R_1 R_2} \\
&= \left\langle \mathcal{E}(l, m, t)\, \mathcal{E}^*\left(l, m, t - \frac{R_2 - R_1}{c}\right) \right\rangle
\frac{\exp[j2\pi\nu(R_1 - R_2)/c]}{R_1 R_2} ,
\end{aligned}
\tag{15.4}
$$

where the superscript asterisk denotes the complex conjugate, and the angle brackets
$\langle\ \rangle$ represent a time average. Note that the source is assumed to be spatially
incoherent, which means that terms of the form $\langle E_1(l_p, m_p, t)\, E_2^*(l_q, m_q, t) \rangle$, where
$p$ and $q$ denote different elements of the source, are zero. If the quantity $(R_2 - R_1)/c$
is small compared with the reciprocal receiver bandwidth, we can neglect it within
the angle brackets of Eq. (15.4), where it occurs in the amplitude term for $\mathcal{E}$.
Equation (15.4) then becomes

$$ \langle E_1(l, m, t)\, E_2^*(l, m, t) \rangle = \langle \mathcal{E}(l, m, t)\, \mathcal{E}^*(l, m, t) \rangle \frac{\exp[j2\pi\nu(R_1 - R_2)/c]}{R_1 R_2} . \tag{15.5} $$


The quantity $\langle \mathcal{E}(l, m, t)\, \mathcal{E}^*(l, m, t) \rangle$ is a measure of the time-averaged intensity,
$I(l, m)$, of the source. To obtain the mutual coherence function of the fields at points
$P_1$ and $P_2$, we integrate over the source, using $ds$ to represent an element of area
within the $(X, Y)$ plane:

$$ \Gamma_{12}(u, v, 0) = \int_{\rm source} \frac{I(l, m)\, \exp[j2\pi\nu(R_1 - R_2)/c]}{R_1 R_2}\, ds , \tag{15.6} $$

where $u$ and $v$ are the $x$ and $y$ components of the spacing between the points $P_1$
and $P_2$ measured in wavelengths. Note that $(R_1 - R_2)$ is the differential distance in
the path lengths from $(l, m)$ in the source to $P_1$ and $P_2$. The points $P_1$ and $P_2$ have
coordinates $(x_1, y_1)$ and $(x_2, y_2)$, respectively, so $u = (x_1 - x_2)\nu/c$ and
$v = (y_1 - y_2)\nu/c$, where $c/\nu$ is the wavelength. Thus, we obtain $(R_2 - R_1) = (ul + vm)c/\nu$.
Because the distance of the source is very much greater than the distance between $P_1$
and $P_2$, for the remaining $R$ terms, we can put $R_1 \simeq R_2 \simeq R$, where $R$ is the distance
between the $(X, Y)$ and $(x, y)$ origins. Then $ds = R^2\, dl\, dm$, and from Eq. (15.6),

$$ \Gamma_{12}(u, v, 0) = \int\!\!\int_{\rm source} I(l, m)\, e^{-j2\pi(ul + vm)}\, dl\, dm . \tag{15.7} $$
Since the integrand in Eq. (15.7) is zero outside the source boundary, the limits
of the integral effectively extend to infinity, and the mutual coherence $\Gamma_{12}(u, v, 0)$,
which is equivalent to the complex visibility $\mathcal{V}(u, v)$, is the Fourier transform of the
intensity distribution $I(l, m)$ of the source. This result is generally referred to as the
van Cittert-Zernike theorem. However, it is instructive to examine the definition of
the theorem in terms of the diffraction pattern of an aperture, given at the beginning
of this section.
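The Fourier relationship can be illustrated with a small Monte Carlo experiment in one dimension. In the sketch below, a few point-like source elements radiate statistically independent complex Gaussian signals, the fields at two positions are formed by adding the contributions with the appropriate phase factors, and the time-averaged product is compared with the direct Fourier sum of the intensities. The function names, the source positions and intensities, and the antenna positions are all hypothetical choices made for illustration, and the plane-wave phase factor $e^{-j2\pi x l}$ (with $x$ in wavelengths) is a modeling assumption.

```python
import cmath
import math
import random


def simulated_coherence(dirs, intensities, x1, x2, n_samples=20000, seed=1):
    """Time-averaged <E1 E2*> for independent (spatially incoherent) source elements."""
    rng = random.Random(seed)
    acc = 0j
    for _ in range(n_samples):
        e1 = 0j
        e2 = 0j
        for l, inten in zip(dirs, intensities):
            sigma = math.sqrt(inten / 2.0)
            # Independent complex Gaussian amplitude with <|a|^2> = inten.
            a = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            e1 += a * cmath.exp(-2j * math.pi * x1 * l)
            e2 += a * cmath.exp(-2j * math.pi * x2 * l)
        acc += e1 * e2.conjugate()
    return acc / n_samples


def fourier_visibility(dirs, intensities, u):
    """Discrete one-dimensional analog of the Fourier transform of the intensity."""
    return sum(i * cmath.exp(-2j * math.pi * u * l) for l, i in zip(dirs, intensities))


dirs = [0.01, -0.02]          # direction cosines of two source elements
intensities = [1.0, 0.5]      # arbitrary units
x1, x2 = 30.0, 10.0           # antenna positions in wavelengths, so u = 20
print(simulated_coherence(dirs, intensities, x1, x2))
print(fourier_visibility(dirs, intensities, x1 - x2))
```

The two printed values should agree to within the statistical noise of the finite average. If the same random amplitude were used for both source elements (a coherent source), the cross terms would not average to zero and the agreement would fail.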


15.1.2 Diffraction at an Aperture and the Response 
of an Antenna 


The Fraunhofer diffraction field of an aperture, as a function of angle, can be
analyzed using the geometry shown in Fig. 15.1b. Here, an aperture is illuminated
by an electromagnetic field of amplitude $\mathcal{E}(l, m, t)$, where again we use direction
cosines with respect to the $x$ and $y$ axes to indicate points within the aperture as
seen from $P_1$ and $P_2$. The $(x, y)$ plane is in the far field of a wavefront from any
point in the aperture, so such a wavefront can be considered plane over the distance
$P_1P_2$. The aperture is centered on the $(X, Y)$ origin and is normal to the line from
the $(X, Y)$ origin to $P_2$. The phase over the aperture is assumed to be uniform, and
components of the field therefore combine in phase at $P_2$. Thus, in the $(x, y)$ plane,
the maximum field strength occurs at $P_2$. Now consider the field at the point $P_1$,
which has coordinates $(x, y)$. The component of the field at $P_1$ due to radiation from
an element of the aperture at position $(l, m)$ is given by Eq. (15.2). The path lengths
from the point $(l, m)$ at the source to $P_1$ and $P_2$ are $R_1$ and $R_2$, respectively, and
$R_2 - R_1 = lx + my$. Thus, from Eq. (15.2), we can write

$$ E_1(l, m, t) = \frac{e^{-j2\pi\nu(t - R_2/c)}}{R_1}\, \mathcal{E}\left(l, m, t - \frac{R_1}{c}\right) e^{-j2\pi(lx + my)/\lambda} . \tag{15.8} $$

Again, for the remaining $R$ terms, we put $R_1 \simeq R_2 \simeq R$. Integration over the aperture
then gives the total field at $P_1$,

$$ E(x, y) = \frac{e^{-j2\pi\nu(t - R/c)}}{R} \int_{\rm aperture} \mathcal{E}\left(l, m, t - \frac{R}{c}\right) e^{-j2\pi(lx + my)/\lambda}\, ds , \tag{15.9} $$

where $\lambda$ is the wavelength, and the element of area $ds$ is proportional to $dl\, dm$.
The term on the right side that is outside the integral is a propagation factor that
represents the variation in amplitude and phase over the path from the source to $P_2$
in Fig. 15.1b. In applying the result to the radiation pattern of an aperture, we replace
the time-dependent functions $E$ and $\mathcal{E}$ by the corresponding rms field amplitudes,
which will be denoted by $\overline{E}$ and $\overline{\mathcal{E}}$, respectively:

$$ \overline{E}(x, y) \propto \int\!\!\int_{\rm aperture} \overline{\mathcal{E}}(l, m)\, e^{-j2\pi[(x/\lambda)l + (y/\lambda)m]}\, dl\, dm , \tag{15.10} $$

where the propagation factor in Eq. (15.9) has been omitted. A comparison of
Eqs. (15.7) and (15.10) explains the van Cittert-Zernike theorem as described at the
beginning of this section. With the specified proportionality between the incoherent
intensity and the coherent field amplitude, it will be found that

$$ \frac{\Gamma_{12}(u, v, 0)}{\Gamma_{12}(0, 0, 0)} = \frac{\overline{E}(x, y)}{\overline{E}(0, 0)} . \tag{15.11} $$

In Eqs. (15.7) and (15.10), the integrand is zero outside the source or aperture. Thus,
in each case, the limits of integration can be extended to $\pm\infty$, and the equations
are seen to be Fourier transforms. The calculations of the mutual coherence of the
source and the radiation pattern of the aperture yield similar results because the 
geometry and the mathematical approximations are the same in each case. It should 
be emphasized, however, that the physical situations are different. In the first case 
considered, the source is spatially incoherent over its surface, whereas in the second 
case, the field across the aperture is fully coherent. 

The result in Eq. (15.10) also gives the angular radiation pattern for an antenna
that has the form of an excited aperture. The application to an antenna is more useful
if the radiation pattern is specified in terms of an angular representation $(l', m')$
of the direction of radiation from the antenna aperture instead of the position of
the point $P_1$, and if the field distribution over the aperture is specified in terms of
units of length rather than angle. $(l', m')$ are direction cosines with respect to the
$(X, Y)$ axes. Since the angles concerned are small, we can substitute into Eq. (15.10)
$x = Rl'$, $y = Rm'$, $l = X/R$, $m = Y/R$, $dl = dX/R$, and $dm = dY/R$, and obtain

$$ \overline{E}(l', m') \propto \int\!\!\int_{\rm aperture} \overline{\mathcal{E}}_{XY}(X, Y)\, e^{-j2\pi[(X/\lambda)l' + (Y/\lambda)m']}\, dX\, dY . \tag{15.12} $$

This is the expression for the field distribution resulting from Fraunhofer diffraction
at an aperture [see, e.g., Silver (1949)]. It includes the case of a transmitting antenna
in which the aperture of a parabolic reflector is illuminated by a radiator at the
focus. If such an antenna is used in reception, the received voltage from a source
in direction $(l', m')$ is proportional to the right side of Eq. (15.12). Thus, the voltage
reception pattern $V_A(l', m')$, introduced in Sect. 3.3.1, is proportional to the right
side of Eq. (15.12).

To obtain the power radiation pattern for an antenna, we need the response
in terms of $|\overline{E}(l', m')|^2$. From the autocorrelation theorem of Fourier transforms,
the squared amplitude of $\overline{E}(l', m')$ is equal to the Fourier transform of the
autocorrelation of the Fourier transform of $\overline{E}(l', m')$, that is, of the field
distribution over the aperture [see, e.g., Bracewell (2000), and note that this relationship
is also a generalization of the Wiener-Khinchin relationship derived in Sect. 3.2].
Thus, the power radiated as a function of angle is given by

$$ |\overline{E}(l', m')|^2 \propto \int\!\!\int_{\rm aperture} \left[\overline{\mathcal{E}}_{XY}(X, Y) \star\star\; \overline{\mathcal{E}}_{XY}(X, Y)\right] e^{-j2\pi[(X/\lambda)l' + (Y/\lambda)m']}\, dX\, dY , \tag{15.13} $$

where $\overline{\mathcal{E}}_{XY}(X, Y) \star\star\; \overline{\mathcal{E}}_{XY}(X, Y)$ is the two-dimensional autocorrelation function of the
field distribution over the aperture. To obtain absolute values of the radiated field,
the required constant of proportionality can be determined by integrating Eq. (15.13)
over $4\pi$ steradians to obtain the total radiated power and equating this to the power
applied to the antenna terminals. In reception, the power collected by an antenna
is proportional to the power radiated in transmission, so the form of the beam is
identical in the two cases. To illustrate the physical interpretation of Eq. (15.13),
consider the simple case of a rectangular aperture with uniform excitation of the
electric field. The function $\overline{\mathcal{E}}_{XY}(X, Y)$ is then the product of two one-dimensional
functions of $X$ and $Y$. If $d$ is the aperture width in the $X$ direction, the autocorrelation
function in $X$ is triangular with a width $2d$, and Fourier transformation gives

$$ |\overline{E}_X(l')|^2 \propto \left[\frac{\sin(\pi d l'/\lambda)}{\pi d l'/\lambda}\right]^2 . \tag{15.14} $$

In the $l'$ dimension, the full width of this beam at the half-power level is $0.886\lambda/d$,
for example, $1^\circ$ for $d = 50.8$ wavelengths. For a uniformly illuminated circular
aperture of diameter $d$, the response pattern is circularly symmetrical and is given
by

$$ |\overline{E}_r(l'_r)|^2 \propto \left[\frac{2 J_1(\pi d l'_r/\lambda)}{\pi d l'_r/\lambda}\right]^2 , \tag{15.15} $$

where the subscript $r$ indicates a radial profile in which $l'_r$ is measured from the
center of the beam, and $J_1$ is the first-order Bessel function. The full width of the
beam at the half-power level is $\simeq 1.03\lambda/d$.
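The two half-power widths quoted above can be recovered numerically from the beam profiles. The following sketch is only an illustration: it evaluates the patterns as functions of the dimensionless variable $q = d\,l'/\lambda$, computes $J_1$ from its power series (adequate for the small arguments needed here), and finds the half-power point by bisection. All function names are hypothetical.

```python
import math


def bessel_j1(x, terms=30):
    """First-order Bessel function from its power series (fine for small x)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2.0) ** (2 * k + 1) / (
            math.factorial(k) * math.factorial(k + 1))
    return total


def rect_power(q):
    """Power pattern of a uniform rectangular aperture, q = d*l'/lambda."""
    x = math.pi * q
    return (math.sin(x) / x) ** 2


def circ_power(q):
    """Power pattern of a uniform circular aperture, q = d*l'_r/lambda."""
    x = math.pi * q
    return (2.0 * bessel_j1(x) / x) ** 2


def fwhm(pattern, q_hi):
    """Full width at half power; pattern must fall monotonically on (0, q_hi)."""
    lo, hi = 1e-9, q_hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if pattern(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo


w_rect = fwhm(rect_power, 1.0)    # search up to the first null at q = 1
w_circ = fwhm(circ_power, 1.2)    # first null of the circular pattern is near q = 1.22
print(w_rect, w_circ, math.degrees(w_rect / 50.8))
```

The widths are returned in units of $\lambda/d$, so dividing by the aperture size in wavelengths gives the beamwidth in radians.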

A more direct way of obtaining the Fraunhofer radiation pattern of an aperture
antenna is to start by considering the field strength of the radiated wavefront as a
function of direction, rather than the field strength at a single point $P_1$, as above.
However, the method used was chosen to provide a more direct comparison with 
the interferometer response to a spatially incoherent source. For a more detailed 
analysis of the response of an antenna, see, for example, Booker and Clemmow 
(1950), Bracewell (1962), or the textbooks on antennas in the Further Reading of 
Chapter 5. 
15.1.3 Assumptions in the Derivation and Application
of the van Cittert-Zernike Theorem 


At this point, it is convenient to collect and review the assumptions and limitations 
that are involved in the theory of the interferometer response. 


1. Polarization of the electric field. Although the electric fields are vector quantities 
with directions that depend on the polarization of the radiation, the components 
received by antennas from different elements of the source can be combined in 
the manner of scalar quantities. The fields are measured by antennas at $P_1$ and
$P_2$, and each antenna responds to the component of the radiation for which the
polarization matches that of the antenna. If the fields are randomly polarized 
and the antennas are identically polarized, then the signal product in Eq. (15.4) 
represents half the total power at each antenna. However, the antenna polariza- 
tions do not have to be identical since, in general, the interferometer system will 
respond to some combination of components of the source intensity determined 
by the antenna polarizations. The ways in which the antenna polarizations can 
be chosen to examine all polarizations of the incident radiation are described in 
Sect. 4.7.2. Thus, the scalar treatment of the field involves no loss of generality. 

2. Spatial incoherence of the source. The radiation from any point on the source 
is statistically independent from that from any other point. This applies almost 
universally to astronomical sources and permits the integration in Eq. (15.6) 
by allowing cross products representing different elements of the source to be 
omitted. The Fourier transform relationship provided by the van Cittert-Zernike 
theorem requires the source to be spatially incoherent. Spatial coherence and 
incoherence are discussed in Sect. 15.2. Note that an incoherent source gives 
rise to a coherent or partially coherent wavefront as its radiation propagates 
through space. If this were not the case, the mutual coherence (or visibility) of 
an incoherent source, measured by spaced antennas, would always be zero. 

3. Bandwidth pattern. The assumption required in going from Eq. (15.4) to Eq. (15.5),
that $(R_2 - R_1)/c$ is less than the reciprocal bandwidth $(\Delta\nu)^{-1}$, can be written

$$ \frac{\Delta\nu}{\nu} < \frac{1}{l_d u}, \qquad \frac{\Delta\nu}{\nu} < \frac{1}{m_d v}, \tag{15.16} $$

where $l_d$ and $m_d$ are the maximum angular dimensions of the source. This is
the requirement that the source be within the limits imposed by the bandwidth 
pattern of the interferometer, which is discussed in Sect. 2.2. Conversely, the 
required field of view limits the maximum bandwidth that can be used in a single 
receiving channel. The distortion caused by the bandwidth effect is discussed 
further in Sect. 6.3.1 and, if not severe, can often be corrected. 

4. Distance of the source. For an array with maximum baseline $D$, the departure
of the wavefront from a plane, for a source at distance $R$, is $\sim D^2/R$. Thus, the
far-field distance $R_{\rm ff}$, defined as that for which the divergence is small compared
with the wavelength $\lambda$, is given by

$$ R_{\rm ff} \gg D^2/\lambda . \tag{15.17} $$

The far-field condition implies that the antenna spacing subtends a small angle as
seen from the source and results in the approximation for Fraunhofer diffraction.
If the source is at a known distance closer than the far-field distance, then the
phase term can be compensated. This may sometimes be necessary in solar
system studies. For example, for an antenna spacing of 35 km and a wavelength
of 1 cm, the far-field distance is greater than $1.2 \times 10^{11}$ m, or approximately the
distance to the Sun. On the other hand, the distances to sources in the near field
such as Earth-orbiting satellites can be determined from measurements of the
wavefront curvature (e.g., Sect. 9.11). When the source is beyond the far-field distance,
no information concerning its structure in the line-of-sight direction can be obtained,
only the intensity distribution as projected onto the celestial sphere. (Line-of-sight
structure can be determined by modeling velocity structure.)

5. Use of direction cosines. In going from Eq. (15.6) to Eq. (15.7), the path difference
$(R_2 - R_1)$ is specified in terms of the baseline coordinates $(u, v)$ and angular
coordinates $(l, m)$. The expression for the path difference is precise if $l$ and $m$ are
specified as direction cosines. In integration over the source, the element of solid
angle corresponding to increments $dl$ and $dm$ is equal to $dl\, dm/n$, where $n$ is the
third direction cosine and is equal to $\sqrt{1 - l^2 - m^2}$. In optics, derivation of the
van Cittert-Zernike theorem usually involves the assumption that the source subtends only
small angles at the measurement plane. Then $l$ and $m$ can be approximated by the
corresponding small angles, and $n$ can be approximated by unity. As a result, the
relationship between $\mathcal{V}$ and $I$ becomes a two-dimensional Fourier transform, as
in the approximation for limited field size discussed in Sect. 3.1.1. In the radio
case, the less restrictive result in Eq. (3.7) is sometimes required.

6. Three-dimensional distribution of the visibility measurements. As antennas track 
a source, the antenna-spacing vectors, designated above by (u, v) components, 
may not lie in a plane, and three coordinates, (u,v,w), are then required to 
specify them. The Fourier transform relationship is then more complicated, but a 
simplifying approximation can be made if the field of view to be imaged is small. 
These effects are discussed in Sect. 11.7. 

7. Refraction in space. It has been implicitly assumed in the analysis above that 
the space between the source and the antennas is empty, or at least that any 
medium within it has a uniform refractive index, so that there is no distortion 
of the incoming wavefront from the source. However, the interstellar and 
interplanetary media, and the Earth’s atmosphere and ionosphere, can introduce 
effects including rotation of the position angle of a linearly polarized component, 
as discussed in Chaps. 13 and 14. 
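Two of the conditions in the list above, the bandwidth limit of item 3 and the far-field distance of item 4, reduce to simple arithmetic. The sketch below evaluates them in Python. The function names are hypothetical, and the field size and baseline length used for the bandwidth example are illustrative values that do not come from the text.

```python
def far_field_distance(baseline_m, wavelength_m):
    """D^2/lambda in meters; the source should lie well beyond this distance."""
    return baseline_m ** 2 / wavelength_m


def max_fractional_bandwidth(source_extent, baseline_wavelengths):
    """Upper bound 1/(l_d * u) on delta_nu/nu for a source of extent l_d (radians)."""
    return 1.0 / (source_extent * baseline_wavelengths)


# Baseline and wavelength from item 4: 35 km and 1 cm.
print(far_field_distance(35e3, 0.01))

# Hypothetical case: a 1-mrad field observed on a 10^5-wavelength baseline.
print(max_fractional_bandwidth(1e-3, 1e5))
```

The first value is the distance that the source must greatly exceed for the plane-wave approximation to hold. The second is a bound that the fractional bandwidth must stay below for the bandwidth pattern not to attenuate the edges of the field.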
15.2 Spatial Coherence


In the derivation of the interferometer response in Chaps. 2 and 3, and in Eq. (15.5), 
it is assumed that the source under discussion is spatially incoherent. This means 
that the waveforms received from different spatial elements of the source are not 
correlated, which enables us to add the correlator output from the different angular 
increments in the integration over the source. We now examine this requirement 
in more detail. To illustrate the principles involved, it is sufficient to work in one 
dimension on the sky, for which the position is given by the direction cosine $l$.


15.2.1 Incident Field 


Consider the electric field $E(l, t)$ at the Earth’s surface resulting from a wavefront
incident from the direction $l$ at time $t$. Figure 15.2 shows the geometry of the
situation, in which $l = 0$ in the direction OS of the center, or nominal position,
of the source under observation. $l$ is a direction cosine measured from OB, the
normal to OS. A path OS′ is shown that indicates the direction of another part of
the source. Radiation from the direction OS′ produces a wavefront parallel to OB′.
The wavefronts from points on the source are plane because we are considering a
source in the far field of the interferometer. The line OA represents the projection
of the baseline normal to the direction of the source, and the distance OA measured
in wavelengths is equal to $u$. Now consider wavefronts from the directions S and S′
that arrive at the same time at O. To reach the point A, the wavefront from S′ has
to travel a farther distance AA′. With the usual small-angle approximation, we find
that the distance AA′ is equal to $ulc/\nu$, that is, $ul$ wavelengths. Thus, the wave from
direction S′ is delayed at A by a time interval $\tau = ul/\nu$, relative to the wave from
S. If we represent the wave from direction S′ by $E(l, t)$ at O, at A it is $E(l, t - \tau)$.


Fig. 15.2 Diagram to illustrate the variation of phase along a line OB that is perpendicular to
the direction of a source OS, where $l$ is the direction cosine that specifies the direction OS′
and is defined with respect to OB. The angle SOS′ is small and is thus approximately equal
to $l$, as indicated. The line OS′ points toward another part of the same source, and OB′ is
perpendicular to it.




Now because the incident wavefronts are plane, the amplitude of the wave does not
change over the distance AA′. However, the phase changes by $2\pi\nu\tau = 2\pi ul$, so for the
wave from S′ at A, we have

$$ E(l, t - \tau) = E(l, t)\, e^{-j2\pi ul} . \tag{15.18} $$

If $e(u, t)$ is the field at A resulting from radiation from all parts of the source, then

$$ e(u, t) = \int_{-\infty}^{\infty} E(l, t)\, e^{-j2\pi ul}\, dl . \tag{15.19} $$

It will be assumed that the angular dimensions of the source are not large, so we
also have

$$ E(l, t) = 0, \qquad |l| > 1 . \tag{15.20} $$

The condition specified in Eq. (15.20) allows us to write the limits of the integral in
Eq. (15.19) as $\pm\infty$. Note that Eq. (15.19) has the form of a Fourier transform, and
the inverse transform gives $E(l, t)$ from $e(u, t)$. Equation (15.19) will be required in
the following subsection.


15.2.2 Source Coherence 


We now return to the spatial coherence of the source and follow part of a more
extensive analysis by Swenson and Mathur (1968). As a measure of the spatial
coherence, we introduce the source coherence function $\gamma$. This is defined in terms
of the cross-correlation of signals received from two different directions, $l_1$ and $l_2$,
at two different times:

$$
\begin{aligned}
\gamma(l_1, l_2, \tau) &= \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} E(l_1, t)\, E^*(l_2, t - \tau)\, dt \\
&= \langle E(l_1, t)\, E^*(l_2, t - \tau) \rangle .
\end{aligned}
\tag{15.21}
$$

Finite limits are used in the integral to ensure convergence. $\gamma(l_1, l_2, \tau)$ is similar to
the coherence function of a source or object discussed by Drane and Parrent (1962)
and Beran and Parrent (1964).

The complex degree of coherence of an extended source is the normalized source
coherence function

$$ \gamma_N(l_1, l_2, \tau) = \frac{\gamma(l_1, l_2, \tau)}{\sqrt{\gamma(l_1, 0)\, \gamma(l_2, 0)}} , \tag{15.22} $$

where $\gamma(l_1, \tau)$ is defined by putting $l_2 = l_1$ in Eq. (15.21), that is, $\gamma(l_1, \tau) =
\gamma(l_1, l_1, \tau)$. It can be shown by using the Schwarz inequality that
$0 \le |\gamma_N(l_1, l_2, \tau)| \le 1$. The extreme values of 0 and 1 correspond to the cases
of complete incoherence and complete coherence, respectively. When dealing with
extended sources of arbitrary spectral width, it is possible that, for a given pair of
points $l_1$ and $l_2$, $|\gamma_N(l_1, l_2, \tau)|$ is zero for one value of $\tau$ and nonzero for another
value. Therefore, more stringent definitions of complete coherence and incoherence
are necessary. The following definitions are adapted from Parrent (1959):


1. The emissions from the directions $l_1$ and $l_2$ are completely coherent (incoherent)
if $|\gamma_N(l_1, l_2, \tau)| = 1$ (0) for all values of $\tau$.

2. An extended source is coherent (incoherent) if the emissions from all pairs of
directions $l_1$, $l_2$ within the source are coherent (incoherent).


In all other cases, the extended source is described as partially coherent. 
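
The three cases can be illustrated with sampled signals. The following is a minimal sketch, assuming complex Gaussian noise as a stand-in for the emission from each direction and evaluating Eq. (15.22) only at $\tau = 0$ (the definitions above require the condition to hold for all $\tau$). The function names are hypothetical.

```python
import random


def degree_of_coherence(e1, e2):
    """Zero-lag |gamma_N| of Eq. (15.22), estimated from complex samples."""
    n = len(e1)
    cross = sum(a * b.conjugate() for a, b in zip(e1, e2)) / n
    p1 = sum(abs(a) ** 2 for a in e1) / n
    p2 = sum(abs(b) ** 2 for b in e2) / n
    return abs(cross) / (p1 * p2) ** 0.5


def complex_noise(n, rng):
    """n samples of complex Gaussian noise (illustrative signal model)."""
    return [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


rng = random.Random(1)
n = 20000
e1 = complex_noise(n, rng)
independent = complex_noise(n, rng)

# Same signal apart from a complex gain: completely coherent.
coherent = degree_of_coherence(e1, [0.5j * a for a in e1])
# Statistically independent signals: incoherent (estimate tends to 0 as n grows).
incoherent = degree_of_coherence(e1, independent)
# Equal-power mixture of common and independent parts: partially coherent,
# with an expected value of 1/sqrt(2).
partial = degree_of_coherence(e1, [a + b for a, b in zip(e1, independent)])
```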

Consider now the coherence function of the field $e(x_\lambda, t)$ of a distant source
measured, say, at the Earth’s surface, $x_\lambda$ being a linear coordinate measured in
wavelengths in a direction normal to $l = 0$:


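$$\Gamma(x_{\lambda 1}, x_{\lambda 2}, \tau) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} e(x_{\lambda 1}, t)\, e^*(x_{\lambda 2}, t - \tau)\, dt = \left\langle e(x_{\lambda 1}, t)\, e^*(x_{\lambda 2}, t - \tau) \right\rangle . \tag{15.23}$$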


This is a variation of the mutual coherence function $\Gamma_{12}$ in Eq. (15.1), in which
the positions of the measurement points defined by $x_{\lambda 1}$ and $x_{\lambda 2}$ are retained, rather
than just the relative positions given by the baseline components. By using the
Fourier transform relationship between $E(l, t)$ and $e(u, t)$ derived in Eq. (15.19), and
replacing $u$ by $x_\lambda$, we obtain


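$$\Gamma(x_{\lambda 1}, x_{\lambda 2}, \tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \gamma(l_1, l_2, \tau)\, e^{-j2\pi (x_{\lambda 1} l_1 - x_{\lambda 2} l_2)}\, dl_1\, dl_2 , \tag{15.24}$$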


and the inverse transform, which is 


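$$\gamma(l_1, l_2, \tau) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Gamma(x_{\lambda 1}, x_{\lambda 2}, \tau)\, e^{j2\pi (x_{\lambda 1} l_1 - x_{\lambda 2} l_2)}\, dx_{\lambda 1}\, dx_{\lambda 2} . \tag{15.25}$$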


The relationships in Eqs. (15.24) and (15.25) do not provide a means of measuring 
the intensity distribution of a source, except in the case of complete incoherence. 
For complete incoherence, the coherence function can be expressed as 


$$\gamma(l_1, l_2, \tau) = \gamma(l_1, \tau)\, \delta(l_1 - l_2) , \tag{15.26}$$


where $\delta$ is the delta function. Using the relation in Eq. (15.26) in conjunction with
Eqs. (15.24) and (15.25), we find that the self-coherence function of a completely
incoherent source and its spatial frequency spectrum are Fourier transforms of each
other:


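$$\Gamma(u, \tau) = \int_{-\infty}^{\infty} \gamma(l, \tau)\, e^{-j2\pi ul}\, dl \tag{15.27}$$

$$\gamma(l, \tau) = \int_{-\infty}^{\infty} \Gamma(u, \tau)\, e^{j2\pi ul}\, du , \tag{15.28}$$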


where $u = x_{\lambda 1} - x_{\lambda 2}$. It is clear that $\Gamma(u, \tau)$ is independent of $x_{\lambda 1}$ and $x_{\lambda 2}$ and
depends only on their difference. Thus, u can be interpreted as the spacing of two
sample points between which the coherence of the field is measured, and also as the
spatial frequency of the visibility measured over the same baseline. For $\tau = 0$, from
Eqs. (15.21) and (15.26), we obtain


$$\gamma(l, 0) = \left\langle |E(l)|^2 \right\rangle , \tag{15.29}$$


which is the one-dimensional intensity distribution of the source, $I_1$, introduced in
Eq. (1.10). Then from Eqs. (15.27) and (15.29),


$$\Gamma(u, 0) = \int_{-\infty}^{\infty} \left\langle |E(l)|^2 \right\rangle e^{-j2\pi ul}\, dl . \tag{15.30}$$


$\Gamma(u, 0)$ is measured between points along a line normal to the direction $l = 0$. As
measured with an interferometer, it is also the complex visibility V. Equation (15.30)
is the Fourier transform relationship between mutual coherence (visibility) and 
intensity. 
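
A short numerical check of Eq. (15.30) may help. The sketch below assumes a uniform one-dimensional source of angular width 0.002 rad and unit total flux density (both values are illustrative), and evaluates the Fourier integral by a midpoint sum. For such a source, the visibility is expected to follow sin(πuw)/(πuw), with its first null at u = 1/w.

```python
import cmath
import math


def visibility(u, intensity, l_min, l_max, n=2000):
    """Midpoint-rule estimate of Eq. (15.30) over l_min..l_max."""
    dl = (l_max - l_min) / n
    total = 0j
    for k in range(n):
        l = l_min + (k + 0.5) * dl
        total += intensity(l) * cmath.exp(-2j * math.pi * u * l) * dl
    return total


width = 0.002  # assumed angular width of the source, radians


def uniform_strip(l):
    """Uniform intensity normalized to unit total flux density."""
    return 1.0 / width


v_zero = visibility(0.0, uniform_strip, -width / 2, width / 2)          # total flux
v_half = visibility(0.5 / width, uniform_strip, -width / 2, width / 2)  # 2/pi expected
v_null = visibility(1.0 / width, uniform_strip, -width / 2, width / 2)  # first null
```

The baseline at which the visibility first falls to zero, u = 1/w = 500 wavelengths here, is the spacing needed to resolve a source of this width.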

When the incoherence condition in Eq. (15.26) is introduced into Eqs. (15.24) 
and (15.25), two results appear: the van Cittert-Zernike relation between mutual
coherence and intensity, and the stationarity of the mutual coherence with respect 
to u. The physical reason underlying these results is seen in Fig. 15.2. When the 
wavefronts incident at different angles combine at any point, the relative phases of 
their (Fourier) frequency components vary linearly with the position of the point 
(e.g., the position of A along the line OB in Fig. 15.2), and for small $l$, they also
vary linearly with the angle on the sky. As a result, the phase differences of the 
Fourier components at two points depend only on the relative positions of the points, 
not their absolute positions. Interferometer measurements of mutual coherence 
incorporate the phase differences for a range of angles of incidence governed by 
the angular dimensions of the source and the width of the antenna beams. The 
linear relationship between phase and position angle allows us to recover the angular 
distribution of the incident wave intensity from the variation of the mutual coherence 
as a function of u, by Fourier analysis. If the angular width of the source is small 
enough that the distance AA’ in Fig. 15.2 is always much less than the wavelength, 
then the form of the electric field remains constant along the line OA, and the source 
is not resolved. 
Parrent (1959) has shown that an extended source can be completely coherent
only if it is monochromatic. As examples of such a source, one may visualize the 
aperture of a distant, large antenna, or an ensemble of radiating elements all driven 
by the same monochromatic signal. The aperture considered in Sect. 15.1.2 is a 
conceptual example of a coherent source. The difference between the responses of 
an interferometer to a fully coherent source and to a fully incoherent one can be 
explained by the following physical picture. The source can be envisioned as an 
ensemble of radiators distributed over a solid angle on the sky. In the case of a 
coherent source, the signals from the radiators are monochromatic and coherent. 
The radiation in any direction combines into a single monochromatic wavefront 
and produces a monochromatic signal in each antenna of an interferometer. The 
output of the correlator is directly proportional to the product of the two (complex) 
signal amplitudes from the antennas. Thus, if a coherent source is observed with 
$n_a$ antennas, the $n_a(n_a - 1)/2$ pairwise cross-correlations of the signals that are
measured can be factored into $n_a$ values of complex signal amplitude.

In contrast, for an incoherent source, the outputs from radiating elements 
are uncorrelated and must be considered independently. Each one produces a 
component of the fringe pattern in the correlator output. But since the phases of 
these fringe components depend on the positions of the radiators within the source, 
the combined response is proportional not only to the signal amplitudes at the 
antennas but also to a factor that depends on the angular distribution of the radiators. 
This factor, of magnitude $\le 1$, is equal to the modulus of the visibility normalized
to unity for an unresolved (point) source of flux density equal to that of the source 
under observation. Unless the source is unresolved, it is not possible to factor the 
measured cross-correlations into signal amplitude values at the antennas. Because 
the emissions of the radiating elements of a source are uncorrelated, the information 
on the source distribution is preserved in the ensemble of wavefronts they produce 
at the antennas. 
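
The contrast between the two cases can be checked numerically with a discrete version of Eq. (15.24). The sketch below assumes two radiators at $l = \pm 0.001$ and four antennas at arbitrary positions (all values illustrative). A matrix of ones for the source coherence represents the coherent case and the identity matrix represents the incoherent case. If the correlations factor into per-antenna amplitudes, $r_{mn} = a_m a_n^*$, then $r_{01} r_{23} = r_{03} r_{21}$, and this identity is used here as a simple factorization test.

```python
import cmath
import math


def correlations(u, l, source_coherence):
    """r[m][n] = sum_ab C[a][b] exp(-j 2 pi (u_m l_a - u_n l_b))."""
    n_ant, n_src = len(u), len(l)
    r = [[0j] * n_ant for _ in range(n_ant)]
    for m in range(n_ant):
        for n in range(n_ant):
            for a in range(n_src):
                for b in range(n_src):
                    phase = -2.0 * math.pi * (u[m] * l[a] - u[n] * l[b])
                    r[m][n] += source_coherence[a][b] * cmath.exp(1j * phase)
    return r


def factorization_residual(r):
    """Zero when r[m][n] = a[m] * conj(a[n]) for some antenna amplitudes a."""
    return abs(r[0][1] * r[2][3] - r[0][3] * r[2][1])


u = [0.0, 100.0, 250.0, 700.0]   # antenna positions in wavelengths (assumed)
l = [-0.001, 0.001]              # directions of the two radiators (assumed)

coherent = correlations(u, l, [[1, 1], [1, 1]])
incoherent = correlations(u, l, [[1, 0], [0, 1]])

residual_coherent = factorization_residual(coherent)      # ~0: factorable
residual_incoherent = factorization_residual(incoherent)  # clearly nonzero
```

In the incoherent case the correlations depend only on the differences $u_m - u_n$, whereas in the coherent case they depend on the individual antenna positions.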

As shown by the derivation of the angular dependence of the radiation from a 
coherently illuminated aperture [Eq. (15.12)], and suggested by the analogy with a 
large antenna, the radiation from a coherent source is highly directional. Thus, the 
signal strengths observed depend on the absolute positions of the two antennas of 
an interferometer, as in Eqs. (15.24) and (15.25), not only on their relative positions, 
as is the case for an incoherent source. The ability to factor the signal outputs from 
a series of baselines, and the nonstationarity of the correlator output measurements 
with the absolute positions of the antennas, are two characteristics that could allow a 
coherent source to be recognized (MacPhie 1964). From the analysis in Sect. 15.1, it 
is clear that a similar range of antenna spacings is required to resolve an incoherent 
source or to explore the radiation pattern of a coherent source of the same angular 
size. 


15.3 Scattering and the Propagation of Coherence 


It is well known that optical telescope images of single stars made with exposure 
times short compared with the timescale of atmospheric scintillation exhibit multiple
stellar images (see Sect. 17.6.4). These images result from the scattering of
light from the star by irregularities in the Earth’s atmosphere. Something closely 
analogous to this occurs in the case of imaging of an unresolved radio source 
through a medium with strong irregular scattering, such as the interplanetary 
medium within a few degrees of the Sun, as described in Sect. 14.3. Since each 
scattered image results from the emission of the same source, one is led to expect 
that such a situation would simulate the effect of a distribution of coherent point 
sources. In this section, we examine the effects of scattering by considering the 
propagation of coherence in space, following in part a discussion by Cornwell et al. 
(1989). This formalism suggests methods for the recovery of the unscattered image 
from the observed image. 

Given a radiating surface, we wish to know the mutual coherence function on 
another (possibly virtual) surface in space. In the typical radio astronomy situation, a 
number of simplifying assumptions can be made about the geometry of the problem. 
Consider the situation illustrated in Fig. 15.3, in which narrowband radio waves 
propagate from surface S to surface Q. The mutual coherence of two points in space 
is the expectation of the product of the (copolarized) electric fields at the two points. 


Fig. 15.3 Simplified geometry for examining the propagation of coherence. S represents an 
extended source, Q is the location of a scattering screen, and B is the measurement plane. Surfaces 
S, Q, and B are plane and parallel, and $r_1$, $r_2$, $d_1$, and $d_2$ are much greater than the wavelength. All
rays are nearly (but not necessarily exactly) perpendicular to the surfaces. 
For signals correlated with arbitrary time delay, the mutual coherence is


$$\Gamma(Q_1, Q_2, \tau) = \left\langle E(Q_1, t)\, E^*(Q_2, t - \tau) \right\rangle . \tag{15.31}$$


The mutual coherence function $\Gamma$ is a function of the field at two points and the
time difference $\tau$. We consider the propagation of mutual intensity, that is, the
mutual coherence evaluated for $\tau = 0$. Following common practice, we represent
the mutual intensity by $J(Q_1, Q_2) = \Gamma(Q_1, Q_2, 0)$. $J$ will be subscripted by S, Q,
or B to indicate the corresponding plane (Fig. 15.3) of the mutual intensity value.
We assume that the emitting surface is completely incoherent, as is usually the case 
for astronomical objects, and that the observed radiation is restricted to a narrow 
band of frequencies, as dictated by the characteristics of the receiving system. From 
Eq. (15.31) and the Huygens-Fresnel formulation of radiation, it can be shown
(Born and Wolf 1999, Goodman 1985), by a calculation similar to the one used
in deriving Eq. (15.6), that the mutual intensity for points $Q_1$ and $Q_2$ is


$$J_Q(Q_1, Q_2) = \lambda^{-2} \int_S \int_S J_S(S_1, S_2)\, \frac{\exp[-j2\pi (r_1 - r_2)/\lambda]}{r_1 r_2}\, dS_1\, dS_2 , \tag{15.32}$$


where $dS_1$ and $dS_2$ are surface elements of S, and $\lambda$ is the wavelength at the center of the
observed frequency band. 

The condition of incoherence can be represented by the use of a delta function 
(Beran and Parrent 1964), as in Eq. (15.26). Here, the mutual intensity is represented 
by a delta function, and thus, the intensity distribution on the surface Q is found by 
allowing points $Q_1$ and $Q_2$ to merge:


$$J_S(S_1, S_2) = \lambda^2 I(S_1)\, \delta(S_1 - S_2) , \tag{15.33}$$


where the factor $\lambda^2$ has been included to preserve the physical dimension of
intensity. Equation (15.32) then becomes 


$$J_Q(Q_1, Q_2) = \int_S I(S)\, \frac{\exp[-j2\pi (r_1 - r_2)/\lambda]}{r_1 r_2}\, dS . \tag{15.34}$$


When the angular dimension of the source is infinitesimal, that is, when the source is 
unresolved, the integration over the source becomes trivial, and the mutual intensity 
can be factored into terms depending, respectively, on $r_1$ and $r_2$:


$$J_Q(Q_1, Q_2) = I(S) \left[ \frac{\exp(-j2\pi r_1/\lambda)}{r_1} \right] \left[ \frac{\exp(j2\pi r_2/\lambda)}{r_2} \right] , \tag{15.35}$$


where $r_1$ and $r_2$ now originate at a single point S. In the more general case of
a resolved source, Eq. (15.34) cannot be factored. Equations (15.34) and (15.35) 
describe for their respective cases the propagation of mutual coherence in situations 
subject to the constraints of Fig. 15.3 and thus can be used to determine the 
mutual intensity on surface Q resulting from incoherent radiation from surface
S. Examination of Eq. (15.34) reveals that, for the extended source S, the mutual
intensity on Q depends on both $r_1$ and $r_2$ for all pairs of points on Q. Thus, the field
at Q is at least partially coherent for all sources, including those of finite extent. This 
is intuitively reasonable, as all points on Q are illuminated by all points on S. In fact, 
it can be demonstrated rigorously that an incoherent field cannot exist in free space 
(Parrent 1959). 

Suppose now that we have a situation in which the surface Q is actually a screen 
of irregularities in the transmission medium, such as plasma or dust, which scatters 
the radiation from S. The mutual intensity incident on the screen is modified by a 
complex transmission factor T(Q) to produce the transmitted mutual intensity 


$$J_{Qt}(Q_1, Q_2) = T(Q_1)\, T^*(Q_2)\, J_{Qi}(Q_1, Q_2) , \tag{15.36}$$


where subscripts $i$ and $t$ indicate the incident and transmitted mutual intensity,
respectively. From Eq. (15.34), we now define a “propagator” (Cornwell et al. 1989) 
for mutual intensity: 


$$W(S, B) = \int_Q T(Q)\, \frac{\exp[-j2\pi (r + d)/\lambda]}{r d}\, dQ , \tag{15.37}$$


where r and d are defined in Fig. 15.3. Then the mutual intensity on surface B is 
given, in terms of the mutual intensity of an extended source S, by 


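$$J_B(B_1, B_2) = \lambda^{-4} \int_S \int_S J_S(S_1, S_2)\, W(S_1, B_1)\, W^*(S_2, B_2)\, dS_1\, dS_2 . \tag{15.38}$$

For an incoherent extended source,

$$J_B(B_1, B_2) = \lambda^{-2} \int_S I(S)\, W(S, B_1)\, W^*(S, B_2)\, dS , \tag{15.39}$$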


and for a point source of flux density F, the mutual intensity on B becomes 
$$J_B(B_1, B_2) = F \lambda^{-2}\, W(S, B_1)\, W^*(S, B_2) . \tag{15.40}$$


Again, for the unresolved source, the mutual intensity on B consists of two factors, 
each depending only on one position on B. However, for an extended incoherent 
source distribution on S, the mutual intensity depends on differences in position and 
therefore cannot be factored. 
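
The step from Eq. (15.38) to Eqs. (15.39) and (15.40) can be written out explicitly. The following is a sketch under two assumptions: that the prefactor in Eq. (15.38) is $\lambda^{-4}$, the value required for consistency with Eqs. (15.33) and (15.39), and that the point source is represented as $I(S) = F\,\delta(S - S_0)$, where $S_0$ is a label introduced here for the source position.

$$
\begin{aligned}
J_B(B_1, B_2) &= \lambda^{-4} \int_S \int_S \lambda^{2} I(S_1)\, \delta(S_1 - S_2)\, W(S_1, B_1)\, W^*(S_2, B_2)\, dS_1\, dS_2 \\
&= \lambda^{-2} \int_S I(S)\, W(S, B_1)\, W^*(S, B_2)\, dS , \\
J_B(B_1, B_2)\Big|_{I(S) = F \delta(S - S_0)} &= F \lambda^{-2}\, W(S_0, B_1)\, W^*(S_0, B_2) .
\end{aligned}
$$

The last line is a product of one factor depending only on $B_1$ and one depending only on $B_2$, which is the factorization property noted above for the unresolved source.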

The existence of a scattering screen between a source and an observer, with an 
instrument of limited aperture, raises the possibility of greatly increased angular 
resolution resulting from the much larger extent of the scattering screen. The partial 
coherence of radiation from the screen requires that the intensity be measured 
at all points on the measurement plane B, spaced as dictated by the Nyquist 
criterion, rather than at all points in the spatial frequency spectrum as allowed by
the van Cittert-Zernike theorem. The former observing mode results in very much 
more data than does the latter. In two spatial dimensions, a large redundancy of 
data results, so that in principle, not only can the scattering screen be characterized, 
but the source as well. In this respect, the problem is similar to that of self- 
calibration (Sect. 11.3.2). Unfortunately, in the case of the scattering screen, the 
practical difficulties of such observations are enormous, and few significant attempts 
have been made to apply the principle. Cornwell and Narayan (1993) discuss 
the possibilities of statistical image synthesis using scattering to obtain ultrafine 
resolution in a manner somewhat analogous to speckle imaging (see Sect. 17.6.4). 

Emission from a radio source that undergoes strong scattering during propagation 
through space has been investigated by Anantharamaiah et al. (1989), and Cornwell 
et al. (1989). To demonstrate the response of a radio telescope to such a spatially 
coherent source distribution, they observed the strong and essentially pointlike 
source 3C279, which passes close to the Sun each year. Under these conditions, the 
scattering is strong enough to cause amplitude scintillation of the received signals. 
Anantharamaiah and colleagues used the VLA in its most extended configuration, 
for which the longest baselines are approximately 35 km. The velocity of the solar 
wind, of order 100-400 km s$^{-1}$, causes irregularities to sweep across the array
in ~100 ms, so it was necessary to make snapshot observations of duration 10-40 ms
to avoid smearing of the image by the movement of the scattering screen.
Observations were made at wavelengths of 20, 6, and 2 cm, with the source at 
angular distances of 0.9° to 5° from the Sun. It was found that the correlator 
output values could be factored as expected for a coherent source. When correlated 
signals were averaged for about 6 s, an enlarged image of the source was obtained, 
and the enlargement increased as the distance from the Sun decreased. It was 
also demonstrated that it would be possible to determine the characteristics of 
the scattering screen by measuring the mutual intensity function on the ground, 
provided that the latter is measured completely in the two-dimensional spatial 
frequency domain. It is not possible to distinguish between a spatially coherent 
extended source and a scattering screen illuminated by a point source. 

A significant observation was made by Wolszczan and Cordes (1987), who 
were able to infer the dimensions of structure within pulsar PSR 1237+25 from an 
occurrence of interstellar scattering. The pulsar was observed with a single antenna, 
the 308-m-diameter spherical reflector at Arecibo, at a frequency of 430 MHz. 
Dynamic spectra of the received signal (i.e., the received power displayed as a 
function of both time and frequency) showed prominent band structure with maxima 
separated by ~ 300-700 kHz in frequency. This was interpreted in terms of a 
thin-screen model of the interstellar medium, in which refraction of rays from 
the pulsar occurred at two separated points in the screen. The analysis of such a 
model is complicated by the occurrence of both diffractive and refractive scattering, 
resulting from structure smaller and larger than the Fresnel scale, respectively 
(Cordes et al. 1986). The refraction gave rise to two images of the source at the 
radio telescope, resulting in fringes in the intensity of the received signal. The 
distance of the pulsar (0.33 kpc) and its transverse velocity (178 km s$^{-1}$) were
known from other observations, and the distance of the screen was taken to be half
the distance of the pulsar. It was deduced that the angular separation of the images 
was ~ 3.3 mas, corresponding to a spacing of ~1 AU (astronomical unit) between 
the refracting structures. In effect, the refracting structures constitute a two-element 
interferometer, with fringe spacing ~1 μas. For comparison, the angular resolution
of a baseline equal to the diameter of the Earth at 430 MHz would be 44 mas. The 
particular conditions that resulted in this observation lasted for at least 19 days, and 
during that period, observations of other pulsars did not show similar scattering. 
This strongly suggests that the observed phenomenon resulted from a fortuitous 
configuration of the interstellar medium in the direction of the pulsar. 
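
The fringe spacing quoted for the two refracting structures can be checked with a few lines of arithmetic. This is a rough sketch that takes the fringe spacing as λ/b for a baseline b of exactly 1 AU; the function name and the constants are supplied here and are not from the text.

```python
import math

C_M_PER_S = 299_792_458.0                         # speed of light
AU_M = 1.495978707e11                             # astronomical unit in metres
RAD_TO_MICROARCSEC = math.degrees(1.0) * 3600.0 * 1e6


def fringe_spacing_uas(freq_hz, baseline_m):
    """Fringe spacing lambda / b, converted to microarcseconds."""
    wavelength_m = C_M_PER_S / freq_hz
    return wavelength_m / baseline_m * RAD_TO_MICROARCSEC


# 430 MHz observation, refracting structures separated by about 1 AU
spacing = fringe_spacing_uas(430e6, AU_M)
```

The result is slightly under 1 μas, in line with the approximate value given above.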

Apart from cases of scattering such as that described, there are essentially no 
clear cases of spatially coherent astronomical sources, although coherent mechanisms
may occur in pulsars and masers (Verschuur and Kellermann 1988). Fully
coherent sources are not amenable to synthesis imaging using the van Cittert- 
Zernike principle and thus do not fall within the area of principal concern of 
this book. Further material on coherence and partial coherence can be found, for 
example, in Beran and Parrent (1964), Born and Wolf (1999), Drane and Parrent 
(1962), Mandel and Wolf (1965, 1995), MacPhie (1964), and Goodman (1985). 
Chapter 16
Radio Frequency Interference 


A basic requirement of radio astronomy is access to a spectrum in which observations
can be made without detrimental interference from transmissions by other
services. In the early years of radio astronomy, when most of the radio astronomy 
bands below a few GHz were allocated, bandwidths of radio astronomy systems 
were generally no greater than a few MHz, and the comparable allocated bandwidths 
largely sufficed. Some allocations were made for radio lines, most importantly the 
hydrogen (H I) line, for which 1420-1427 MHz was reserved. In the following
decades, as radio astronomy at frequencies in the range of tens of GHz developed, 
bandwidths of order 1 GHz were allocated, and later, a substantial fraction of the 
spectrum above ~ 100 GHz was allocated to radio astronomy. However, spillover 
of radiation from transmitting services into radio astronomy bands occurs, and 
generally it has been necessary to choose observatory sites in radio-quiet areas of 
low population density and to take advantage of terrain shielding where possible. 
These considerations have led to the choice of sites in South Africa and Western 
Australia for international development of several of the largest arrays. Also, with 
the increase in computing capacity at observatories, detection and removal of 
interfering signals in astronomical observations have become important parts of data 
analysis. In particular, digital analysis allows the received bandwidths to be divided 
into as many as 10° spectral channels, which allows those containing interference to 
be identified and removed. A general discussion of interference in radio astronomy 
is given by Baan (2010). 

The most serious interference usually results from intentional radio emissions,
such as those used for transmission of information in many forms, or for radio
location, etc. Interference can also occur as a result of unintentional emissions from
electrical machinery and industrial processes such as welding. Such emissions often 
take the form of trains of short electromagnetic pulses. For example, a rectangular 
pulse of width $\delta t$ has a power spectrum

$$P(\nu) \propto \left[ \frac{\sin(\pi \nu\, \delta t)}{\pi \nu\, \delta t} \right]^2 . \tag{16.1}$$

Most of the power is contained in the frequency range between DC and $1/\delta t$.
However, the envelope of the power spectrum decreases only as $\nu^{-2}$, and hence,
such interference can be a problem at frequencies much higher than $1/\delta t$. Electromagnetic
interference (EMI) of this type is usually most serious at frequencies
below a few GHz. A detailed analysis of radiation produced by sparks in power 
lines was developed by Beasley (1970). To avoid such interference, sites for radio 
astronomy observatories are located in relatively undeveloped areas, and proximity 
to industries and major highways is avoided. Necessary vehicles and machinery on 
observatory sites are generally fitted with filtering components that strongly reduce 
unwanted emissions. Shielding of electronic equipment is very important. 
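
The slow falloff of the envelope in Eq. (16.1) can be made concrete with a short calculation. The sketch below assumes a pulse width of 1 μs (an illustrative value) and evaluates the normalized spectrum near two sidelobe peaks a decade apart in frequency; the function names are hypothetical.

```python
import math


def pulse_power_spectrum(nu_hz, dt_s):
    """Normalized power spectrum of a rectangular pulse, as in Eq. (16.1)."""
    x = math.pi * nu_hz * dt_s
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x) ** 2


def to_db(power_ratio):
    return 10.0 * math.log10(power_ratio)


dt = 1e-6                                      # assumed pulse width, seconds
first_null = pulse_power_spectrum(1.0 / dt, dt)

# Sidelobe peaks lie close to nu = (k + 1/2) / dt
near = pulse_power_spectrum(10.5 / dt, dt)     # about 10.5 MHz
far = pulse_power_spectrum(100.5 / dt, dt)     # about 100.5 MHz
drop_db = to_db(near) - to_db(far)
```

The sidelobe level falls by only about 20 dB for each factor of ten in frequency, which is why such pulses can be detectable well above $1/\delta t$.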


16.1 Detection of Interference 


A basic problem is the identification of the contaminated data. In the simplest case, 
this is a matter of examining the output of a correlator or detector and deleting 
data in which the signal amplitude is larger than expected or does not vary with 
time or antenna pointing in the manner expected for astronomical sources. In 
earlier times, interference removal sometimes meant losing the whole bandwidth 
received, but as mentioned above, use of multichannel spectral processing permits 
deletion of only the contaminated channels. The greatest difficulty is the detection 
of weak interference. Use of channel bandwidths comparable to the bandwidth of 
an interfering transmitter also has the advantage of maximizing the interference-to- 
noise ratio, thus improving the detectability. 

When inspecting data for variations that indicate the presence of interference, 
data averaging times of seconds or minutes are often appropriate when the interference
varies on similar timescales. However, the astronomical measurements
may require averaging of data over periods of many hours to obtain the required 
sensitivity, so interference at levels that can introduce errors in the data may be 
too weak relative to the noise to be easily detected. With the high data output 
rates produced by large synthesis arrays, it is impractical to examine all of the 
data manually, and algorithms by which contaminated data can be flagged by 
computer are important. Methods of dealing with radio interference include (1) 
those that simply delete receiver output data that are believed to be contaminated 
by interference; (2) those that cancel or reduce the interference without removing 
the astronomical data that occur at the same time and frequency; and (3) those that 
involve spatial filtering, in which a null is generated in the antenna reception pattern 
in the direction of an interferer. 

The common characteristic of interfering signals is that, in general, their sources 
do not move relative to the antennas at the sidereal rates of astronomical objects. 
When the effects of the sidereal motion of the astronomical target are removed
from the correlated data, the fringe frequency variations are transferred to any 
extraneous signals, and thus the unwanted signals can be identified by the fringe 
rate variations in their phases. For long baselines where the fringe rates are high, the 
interference will be attenuated by the subsequent time averaging of the data (e.g., 
Perley 2002). However, in many cases, some further analysis is required to remove 
the effects of the interference, as considered in Sect. 16.2. Athreya (2009) describes 
the application to the Giant Metrewave Radio Telescope array in India. 
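
The attenuation of a stationary interferer by time averaging can be sketched as follows. The model is an assumption made for illustration: after fringe stopping on the target, the interfering signal is taken to be a unit phasor rotating at a constant fringe rate, so that averaging over a time T reduces its amplitude by |sin(πfT)/(πfT)|. The fringe rates and averaging time are arbitrary example values.

```python
import cmath
import math


def averaging_attenuation(fringe_rate_hz, averaging_time_s):
    """Analytic amplitude reduction of a rotating phasor averaged over T."""
    x = math.pi * fringe_rate_hz * averaging_time_s
    return 1.0 if x == 0.0 else abs(math.sin(x) / x)


def simulated_attenuation(fringe_rate_hz, averaging_time_s, n=10000):
    """Same quantity from a direct average of n phasor samples."""
    dt = averaging_time_s / n
    total = sum(cmath.exp(2j * math.pi * fringe_rate_hz * (k + 0.5) * dt)
                for k in range(n))
    return abs(total) / n


slow = averaging_attenuation(0.01, 10.25)   # short baseline: little reduction
fast = averaging_attenuation(1.0, 10.25)    # long baseline: strong reduction
check = simulated_attenuation(1.0, 10.25)
```

With these values the slowly rotating phasor is almost unaffected, while the rapidly rotating one is reduced to about 2% of its amplitude.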

Some examples of techniques to detect the presence of interference are as 
follows. For a general overview, see also Fridman and Baan (2001), Briggs and 
Kocz (2005), and Baan (2010). 


1. Use of monitoring receivers with antennas pointed toward likely sources of 
interference, such as toward the horizon for terrestrial transmitters (e.g., Rogers 
et al. 2005). 

2. Comparison of data taken simultaneously at two observatories that are sufficiently
widely separated that interference from any transmitter is unlikely to be
received at both. This has been used in the search for pulses and other transient 
astronomical emissions (e.g., Bhat et al. 2005). 

3. Detection of transmissions with cyclostationary characteristics, that is, transmission
in which some characteristic repeats at intervals $\tau_c$ in time. Examples are the
frame cycles in TV signals and repeated data cycles in GPS (global positioning
system) transmissions. Values of $\tau_c$ can be determined for the expected signal
environment. The occurrence of components of the data with cyclostationary
characteristics can be investigated by performing an autocorrelation and looking
for features that repeat at intervals $\tau_c$. Bretteil and Weber (2005) find that searching
for a Fourier component at frequency $1/\tau_c$ in the data is an advantageous
method.

4. Use of the closure amplitude relationships [Eq.(10.44)] can provide an 
indication of interference. If observations are made of a point source in the 
absence of interference, the visibility ratios on the right side of the equation 
are equal to unity, and the values of rj on the left side are proportional to the 
corresponding main-beam gains. A signal from an interfering source will add a 
component to the output, the magnitude of which will depend upon the sidelobe 
gains of the antennas, which will vary with time as the antennas track. Thus, 
variation of the closure gain values is an indication of possible interference. 
Note, however, that the correction for the fringe frequency of the target source 
will cause variation of the same frequency in the response to interference from 
a stationary transmitter, so the effect of the interference will decrease (i.e., the 
interference threshold will increase) with an increase of the baseline component 
u, as discussed in Sect. 16.3.2. 

5. In the case of interference from radar pulses, the individual pulses are sometimes 
strong enough to be seen by looking at the output of a detector, especially if the 
transmitter is close enough that a direct signal path exists. It may be possible 
to determine the timing pattern of the pulses and generate blanking pulses for 
the radio astronomy receivers. The situation may be such that it is necessary to 
extend the blanking to include reflections from nearby aircraft, etc. See, e.g., the
discussion by Dong et al. (2005). A buffer memory for the data allows blanking 
to begin just before the pulse is detected, to ensure effective removal. 

6. An interesting object lesson is the identification of solitary radio bursts of 
millisecond duration with swept frequency vs. time characteristics similar to 
those of pulsars. These events were known as perytons. The radiation entered 
multiple channels of the telescope’s multibeam receivers, showed peculiar kinks 
in the time frequency dependence, and preferentially occurred at midmorning. 
These characteristics led to the source: prematurely opened doors of microwave 
ovens at the observatory (Petroff et al. 2015). 

7. Examination of the statistics of the receiver output data. Kurtosis is defined as
$\mu_4/\mu_2^2$, where $\mu_2$ and $\mu_4$ are the mean values of the second and fourth powers of
the data with respect to the mean value. For Gaussian noise, kurtosis has a value
of 3.0, and other values are an indication of non-Gaussian data (Nita et al. 2007; 
Nita and Gary 2010). 

8. With high-resolution multichannel receivers, interference can be detected and 
excised from deviant channels (e.g., Leshem et al. 2000). 
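
The kurtosis test in item 7 can be sketched directly from its definition. The example below applies the moment ratio to simulated voltage samples; the noise level, the amplitude and frequency of the interfering tone, and the function name are illustrative assumptions, and this is not the specific estimator developed in the cited papers.

```python
import math
import random


def kurtosis(samples):
    """mu_4 / mu_2**2, with moments taken about the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    mu2 = sum((x - mean) ** 2 for x in samples) / n
    mu4 = sum((x - mean) ** 4 for x in samples) / n
    return mu4 / mu2 ** 2


rng = random.Random(7)
n = 50000
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical narrowband interferer: a sinusoid of amplitude 3 (noise rms = 1)
tone = [3.0 * math.sin(2.0 * math.pi * 0.1234567 * k) for k in range(n)]
contaminated = [a + b for a, b in zip(noise, tone)]

k_clean = kurtosis(noise)          # close to 3 for Gaussian noise
k_rfi = kurtosis(contaminated)     # pulled well below 3 by the tone
```

A steady tone lowers the kurtosis, whereas sparse impulsive interference would raise it, so in practice departures from 3 in either direction are of interest.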


16.1.1 Low-Frequency Radio Environment 


The LOw Frequency ARray (LOFAR array; see Sect. 5.7.1), which is located in the 
Netherlands with long baseline extensions in other countries, covers the frequency 
ranges 10-80 and 110-240 MHz, thus avoiding the FM broadcast band. Discussions 
of the problem of radio interference in these frequency ranges are given by Boonstra 
and van der Tol (2005) and Offringa et al. (2013). The latter provides a detailed 
examination of the radio environment in the 30-78 MHz and 115-163 MHz ranges. 
For these measurements, the received signals were split into 512 sub-bands of 
195 kHz width. The spectral resolution of the data was 0.76 kHz. Offringa et al. 
(2013) found that the interference occupancy was 1.8% for the lower band and 3.2% 
for the higher one. They concluded that these levels of narrowband interference 
should not significantly restrict astronomical observations, but that it is important 
that the frequency range of LOFAR remain free of broadband interference. A 
similar analysis has been carried out for the Murchison Widefield Array in Western 
Australia (Offringa et al. 2015). 


16.2 Removal of Interference 


When possible, mitigation of interference by cancellation, leaving the astronomical 
data intact, is clearly preferable to total deletion of the corrupted data. Cancellation 
requires not only detection but also an accurate estimation of the interfering signal 
in order to remove it. In adaptive cancellation [see, e.g., Barnbaum and Bradley 
(1998)], a separate antenna (usually smaller than the astronomy antenna) is pointed 
toward the interferer. The received signal from this antenna is digitized, passed 
through an adaptive filter, and combined with the signal from the astronomy
antenna. The combined outputs are processed by an algorithm that provides a control 
of the adaptive filter in such a way as to cause the interfering signal voltages from 
the two antennas to cancel each other. Of various algorithms that could be used 
to control the adaptive filter, Barnbaum and Bradley used a least-mean-squares 
algorithm, which is computationally simple and thus easily adapted to follow the 
relative variation of the astronomical and interfering signals as the astronomy 
antenna tracks. All of this takes place before the signals reach the correlator or 
detector. Briggs and Kocz (2005) give an example of an interference cancellation 
scheme in which the outputs of the astronomy and interference antennas are cross- 
correlated to provide a control for the adaptive filter. For a detailed discussion of 
methods involving cross-correlation of astronomy antenna outputs with those of 
auxiliary antennas directed toward the source of interference, see Briggs et al. (2000).
In some cases in which the structure of the interfering signal is known in detail, as 
in the case of the GLONASS (Global Navigation Satellite System) navigational 
satellite signals, it is possible to recreate the interfering signal from the interference 
received in the astronomy antenna with sufficient accuracy for cancellation. In an 
example discussed by Ellingson et al. (2001), interference from GLONASS was 
reduced by 20 dB. 
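
The adaptive cancellation loop described above can be illustrated with a short simulation. The following is a minimal sketch and not the Barnbaum and Bradley implementation: it assumes real-valued sampled signals, a sinusoidal interferer, and a hypothetical helper `lms_cancel`. The number of taps, the step size `mu`, and the signal levels are illustrative values chosen for this example only.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=8, mu=0.01):
    """Subtract an adaptively filtered copy of `reference` from `primary`.

    Least-mean-squares update: the filter weights are nudged at each sample
    in the direction that reduces the power of the canceller output.
    Returns the cleaned signal and the final filter weights.
    """
    w = np.zeros(n_taps)
    cleaned = np.zeros(len(primary))
    for k in range(n_taps - 1, len(primary)):
        x = reference[k - n_taps + 1:k + 1][::-1]  # most recent sample first
        e = primary[k] - w @ x                      # canceller output
        w += 2.0 * mu * e * x                       # LMS weight update
        cleaned[k] = e
    return cleaned, w

# Illustrative (assumed) signals: weak noise-like astronomy signal plus a
# delayed, attenuated copy of a narrowband interferer.
rng = np.random.default_rng(0)
n_samples = 20000
sky = 0.1 * rng.standard_normal(n_samples)
rfi = np.sin(2 * np.pi * 0.05 * np.arange(n_samples))
primary = sky + 0.8 * np.roll(rfi, 3)                       # astronomy antenna
reference = rfi + 0.01 * rng.standard_normal(n_samples)     # reference antenna

cleaned, weights = lms_cancel(primary, reference)
power_before = np.var(primary)
power_after = np.var(cleaned[n_samples // 2:])   # after the filter has converged
print(f"power before: {power_before:.4f}, after: {power_after:.4f}")
```

In this sketch the output power after convergence approaches that of the simulated astronomy signal alone, because the reference input contains almost none of that signal. A real system must also follow the changing relative gain and delay as the astronomy antenna tracks.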


16.2.1 Nulling for Attenuation of Interfering Signals 


Spatial nulling involves using a group of antennas in which a null in the combined 
spatial response is formed in the direction of the source of interference. In low- 
frequency arrays in which the individual receiving elements are dipoles with beams 
covering much of the sky, such nulling may also result in loss of astronomical sky 
coverage. 

In deterministic nulling, the direction of the interferer is known, and a null is 
formed in that direction by weighting the signals received. Weighting factors (in 
amplitude and phase) can be applied to the signals from individual antennas if they 
are being combined, as in a phased array, or to the correlated products from antenna 
pairs before they are combined to form an image. It is not necessary to be able 
to identify the interference within the received signal, but if the angular responses 
in the direction of the null differ from one antenna to another, it is necessary to 
calibrate the antenna responses, which may not be practicable for the far sidelobes. 
Deterministic nulling can be applied to a synthesis array in two ways. First, the 
nulls can be formed by adjusting the weights with which the cross products of the 
outputs of pairs of antennas (the visibility values) are combined. In this case, the 
nulls are formed in the synthesized beam pattern, i.e., most likely in the sidelobes of 
the synthesized beam unless the direction of the interferer is within the synthesized 
field. Second, in the case of synthesis arrays in which the elements between which 
cross-correlations are formed are themselves phased subarrays of antennas, the nulls 
can be formed in the subarray beams. In this case, the weighting is applied directly 
to the signals from the individual antennas.


16.2.2 Further Considerations of Deterministic Nulling


Consider an array of n nominally identical antennas, each of which is connected 
through a phase shifter to an n-to-1 power combiner. Each picks up a power level 
p from an interfering signal. In the power combiner, the power is divided n ways 
between the other antennas and the output. Thus, each antenna contributes a power 
level p/n to the combiner output. The voltage contributions of the antennas can be 
represented by vectors of amplitude $\sqrt{p/n}$. If the phase shifters are adjusted so that
the contributions combine in phase, the vectors are aligned and the output voltage
is $\sqrt{np}$. The output power is np, as expected, since the total collecting area is n
times that of a single antenna. Now suppose that the phase shifters are set so that
the signal vectors combine with random phase angles. The combined voltage has
an expectation of $\sqrt{n}$ times that of a single antenna. Thus, the expectation of the
combined power received is equal to that from a single antenna, p (see Sect. 9.9 for
a related discussion of phasing of arrays). Finally, consider the case in which the
phases are adjusted so that the vectors form a closed loop with zero resultant, thus
producing a null in the direction of incidence of the signal. If each signal vector
has a random error in amplitude and phase of relative rms amplitude $\epsilon$, then the
vector sum will fail to close by an amount equal to the sum of the errors, i.e.,
$\sim \epsilon\sqrt{p}$, resulting in a power level of $\epsilon^2 p$. Thus, a null of depth x dB below the
response of a single antenna requires $\epsilon = 10^{-x/20}$, e.g., $\epsilon = 0.03$ for a null depth
of 30 dB. These requirements on the accuracy of the voltage responses apply to the 
interference components identified in adaptive nulling and to the accuracy of the 
antenna responses in deterministic nulling. In closing the vector loop for a null, the 
shape of the loop is not constrained, so free parameters remain for forming beams 
or nulls in other directions. 
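
The relation between null depth and the rms error of the signal vectors can be checked numerically. The sketch below is an illustration under stated assumptions: the closed loop is taken to be n equal vectors with uniformly spaced phases, the errors are modeled as independent complex Gaussian perturbations of relative rms amplitude `eps`, and the function names are hypothetical.

```python
import numpy as np

def required_rms_error(null_depth_db):
    """Relative rms error allowed for a null `null_depth_db` below the
    single-antenna response, from eps = 10**(-x/20)."""
    return 10.0 ** (-null_depth_db / 20.0)

def mean_residual_power(n, p, eps, trials=20000, seed=1):
    """Monte Carlo estimate of the power left in an imperfect null.

    Each of the n vectors has amplitude sqrt(p/n); the ideal phases close
    the loop so that the vectors sum to zero.
    """
    rng = np.random.default_rng(seed)
    ideal = np.sqrt(p / n) * np.exp(2j * np.pi * np.arange(n) / n)
    # Complex Gaussian errors with unit total variance, scaled by eps.
    err = (rng.standard_normal((trials, n))
           + 1j * rng.standard_normal((trials, n))) / np.sqrt(2.0)
    perturbed = ideal * (1.0 + eps * err)
    return np.mean(np.abs(perturbed.sum(axis=1)) ** 2)

eps_30db = required_rms_error(30.0)          # close to 0.03
residual = mean_residual_power(n=16, p=1.0, eps=eps_30db)
print(f"eps = {eps_30db:.4f}, residual power / p = {residual:.2e}")
```

With these assumptions the residual power comes out close to eps squared times p, independent of n, which is the scaling stated in the text.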

In forming a null in a given direction, one can start by determining the complex 
gain factors required to close the vector loop on the assumption that the antennas are 
all ideal isotropic radiators. Then, to take account of the actual gain of the antennas, 
each signal vector has to be multiplied by a further complex gain factor. If this 
second gain factor is the same for each antenna, the size and orientation of the 
vector loop may be changed, but it will remain closed. Thus, the response factor 
in the direction of the interferer need not be known so long as it is identical for all 
antennas. If the gain factor differs from one antenna to another, as is likely to be 
the case for signals received through far sidelobes, the loop will not close unless the 
individual gain factors are known and taken into account. The need to calibrate the 
far sidelobes of a high-gain reflector antenna over a large fraction of $4\pi$ steradians,
and perhaps also as a function of frequency across receiver bandwidth, limits the 
usefulness of deterministic nulling in such cases. Deterministic nulling is discussed 
by Smolders and Hampson (2002); Ellingson and Hampson (2002); Ellingson and 
Cazemier (2003); Raza et al. (2002); and van der Tol and van der Veen (2005).


16.2.3 Adaptive Nulling in the Synthesized Beam


A way of removing the effects of an interfering signal is to place a null in the 
reception pattern of an array of antennas in the direction of incidence of the 
interference. This can be done in software and is referred to as adaptive nulling. 
In general, the direction of incidence of the interference is not known and must 
be deduced from the observations. Adaptive nulling, in which the system reacts 
automatically to an interfering signal, generally requires that the interference is not 
too strong and originates from a single source. It can also result in a significant 
computing load. Details can be found in Leshem and van der Veen (2000); Ellingson 
and Hampson (2002); Raza et al. (2002); and van der Tol and van der Veen (2005). 


16.3 Estimation of Harmful Thresholds 


In the efforts to obtain protection for radio astronomy observations in the systems 
of frequency regulation within the International Telecommunication Union (ITU), 
and also within the regulatory systems of individual nations, it has been essential for 
radio astronomers to provide quantitative estimates of the threshold levels of signal 
power that are harmful to astronomical observations. These vary with frequency 
and also with the type of radio telescopes involved. This section is concerned with 
the estimation of these harmful thresholds, particularly for the frequency bands 
allocated to radio astronomy. 

The ultimate limit on the sensitivity of a radio telescope is set by the system noise, 
and an interfering signal can generally be tolerated if its contribution to the output 
image is small compared with the noise fluctuations. A response to interference of 
one-tenth of the rms level of the noise in the measurements is a useful criterion in 
interference threshold calculations. The corresponding flux density of such a signal 
can be calculated if the effective collecting area of the antenna, in the direction of the 
interference, is known. Except at the longer wavelengths, radio astronomy antennas 
usually have narrow beams, and the probability of the interfering signal being 
received in the main beam or nearby sidelobes is low, especially if the interfering 
transmitter is ground based. Thus, we assume here that interference usually enters 
the far sidelobes of the antenna. Figure 16.1 shows an empirical model curve for 
the maximum sidelobe gain as a function of angle from the main-beam axis. This 
curve is derived from the measured response patterns of a number of large reflector 
antennas. For the present estimate, it is appropriate to use a gain of 0 dBi (i.e., 0 dB 
with respect to an isotropic radiator), which occurs at about 19° from the main beam. 
Zero dBi is also the mean gain of an antenna over $4\pi$ steradians, and the effective
collecting area for this gain is equal to $\lambda^2/4\pi$, where $\lambda$ is the wavelength. If $F_h$
(W m$^{-2}$) is the flux density of an interfering signal within the receiver passband, the
interference-to-noise power ratio in the receiver is

$$\frac{F_h \lambda^2}{4\pi k T_S \Delta\nu} , \tag{16.2}$$

where k is Boltzmann's constant, $T_S$ is the system noise temperature, and $\Delta\nu$ is
the receiver bandwidth. In this expression, it is assumed that the polarization of
the interfering signal matches that of the antenna. Since radio astronomy antennas
commonly receive two polarizations, crossed linear or opposite circular, choice
of antenna polarization is of little help in avoiding interference. In practice, the
received level of the interfering signal varies with time because of propagation
effects and the tracking motion of the radio telescope, which sweeps the sidelobe
pattern across the direction of the transmitter.

Fig. 16.1 Empirical sidelobe-envelope model for reflector antennas of diameter greater than 100
wavelengths. Measurements on antennas show that 90% of sidelobe peaks lie below the curve.
Sidelobe levels can be reduced by 3 dB or more in designs in which aperture blockage by feed
structure is eliminated or minimized. The model shown is representative of large antennas with
tripod or quadrupod feed supports of the type commonly used in radio astronomy. From ITU-R
Recommendation SA.509-1 (1997).

For comparison with correlator systems, we first consider the simpler case of 
a receiver that measures the total power at the output of a single antenna. The 
interference-to-noise ratio of the output, after square-law detection and averaging 
for a time $\tau_a$, is expression (16.2) multiplied by $\sqrt{\Delta\nu\,\tau_a}$. This result follows
from considerations similar to those discussed in Sect. 6.2.1. Then for an output 
interference-to-noise ratio of 0.1, which we use as the criterion for the threshold of 
harmful interference, 


$$F_h = \frac{0.4\pi k T_S \nu^2 \sqrt{\Delta\nu}}{c^2 \sqrt{\tau_a}} . \tag{16.3}$$

Fig. 16.2 Curves of the estimated harmful threshold of interference in dBW m$^{-2}$ Hz$^{-1}$. The
lowest curve is for total power (TP) measurement on a single antenna, and the topmost curve
is for VLBI. The data in these curves are from ITU-R Recommendation RA.769-2 (2003) and essentially
represent the considerations in Sects. 16.3 and 16.5. The smaller variations result from the 
characteristics of individual instruments. The two middle curves in the figure represent synthesis 
arrays and are based on the VLA at the most compact and most extended configurations, for which 
the corresponding spacings vary by a factor of about 35. The major feature in all of the curves is the 
increase with frequency, which results mainly from the effective collecting area of the receiving
sidelobes, which varies as $\nu^{-2}$. Because of various simplifying assumptions, the results in this
figure are only approximate, but they provide an indication of the relative vulnerability of different
types of observations.


Note that the harmful threshold increases with frequency as $\nu^2$ as a result of the
dependence of the sidelobe collecting area. With increasing frequency, the system
temperature and the usable bandwidth also generally increase. Expressed in spectral
power flux density, the corresponding threshold level, $S_h$ (W m$^{-2}$ Hz$^{-1}$), is

$$S_h = \frac{F_h}{\Delta\nu} = \frac{0.4\pi k T_S \nu^2}{c^2 \sqrt{\tau_a \Delta\nu}} . \tag{16.4}$$

To determine the harmful interference level for continuum observations within
a band allocated to radio astronomy, $\Delta\nu$ is usually taken to be the width of
the allocated band. The total-power type of radio telescope is the most sensitive
to interference. Thus, the results in Eqs. (16.3) and (16.4) provide a worst-case
specification for the harmful thresholds of interference for radio astronomy. Values
of $F_h$ and $S_h$ computed for total-power systems using typical parameters for the
various radio astronomy bands are given in ITU-R¹ documentation (ITU-R 2013).
For $S_h$, the values are plotted as the bottom curve in Fig. 16.2. Since much of the
interference to radio astronomy results from broadband spurious emissions, $S_h$ is
particularly useful.

¹ITU-R denotes the Radiocommunication Sector of the International Telecommunication Union.
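
As a worked example of the total-power thresholds in Eqs. (16.3) and (16.4), the following sketch evaluates $F_h$ and $S_h$ numerically. The function name is hypothetical, and the parameter values (a 22 K system temperature, a 27-MHz band centered near 1413.5 MHz, and a 2000-s averaging time) are assumptions chosen for illustration, not values quoted in this chapter.

```python
import math

K_B = 1.380649e-23    # Boltzmann's constant, J/K
C = 299792458.0       # speed of light, m/s

def total_power_threshold(t_sys, freq, bandwidth, t_avg, ratio=0.1):
    """Harmful threshold for a total-power receiver with 0 dBi sidelobe gain.

    Returns (F_h in W m^-2, S_h in W m^-2 Hz^-1) for an output
    interference-to-noise ratio `ratio` (0.1 gives the 0.4*pi factor).
    """
    f_h = (4.0 * math.pi * ratio * K_B * t_sys * freq ** 2
           * math.sqrt(bandwidth / t_avg) / C ** 2)
    s_h = f_h / bandwidth
    return f_h, s_h

# Illustrative (assumed) parameters.
f_h, s_h = total_power_threshold(t_sys=22.0, freq=1413.5e6,
                                 bandwidth=27e6, t_avg=2000.0)
f_h_db = 10.0 * math.log10(f_h)
s_h_db = 10.0 * math.log10(s_h)
print(f"F_h = {f_h_db:.1f} dBW m^-2, S_h = {s_h_db:.1f} dBW m^-2 Hz^-1")
```

For these inputs the thresholds are roughly -180 dBW m$^{-2}$ and -254 dBW m$^{-2}$ Hz$^{-1}$; other bands scale as the square of the frequency and with the system temperature, bandwidth, and averaging time.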

Low-level interference, of amplitude comparable to the noise in the receiver 
output, degrades the sensitivity and impedes the ability to detect weak sources. Thus, 
in observations in which interference has occurred, it is often necessary to delete any 
data that appear to be corrupted. The analysis that follows considers the response to 
interference resulting from basic methods of observation and data reduction and 
does not include procedures designed specifically for mitigation of interference. 


16.3.1 Short- and Intermediate-Baseline Arrays 


We now consider the interference response of a correlator array with antenna 
spacings from a few meters to a few tens of kilometers. Two effects reduce the 
response to interference compared with that of a total-power system. First, the 
source of interference does not move across the sky with the sidereal motion of 
the object under observation, and thus it produces fringe oscillations of a different 
frequency from those of the wanted signal. Second, the instrumental delays are 
adjusted to equalize the signal paths for radiation incident from the direction of 
observation, and signals from another direction, if they are broadband, are to some 
extent decorrelated. The following analysis is based on Thompson (1982). 


16.3.2 Fringe-Frequency Averaging 


Consider first the fringe-frequency effect. Suppose that instrumental phase shifts are 
introduced, as described in Sect. 6.1.6, to slow the fringe oscillations of the wanted 
signal to zero frequency. The removal of the fringe-frequency phase shifts from the 
cosmic signals introduces corresponding shifts into the interfering signals. If the 
source of interference is stationary with respect to the antennas, the interference at 
the correlator output has the form of oscillations at the natural fringe frequency for 
the source under observation, which from Eq. (4.9) (omitting the sign of dw/dt) is 


$$\nu_f = \omega_e u \cos\delta . \tag{16.5}$$

Here $\omega_e$ is the angular rotation velocity of the Earth, u is a component of antenna
spacing, and $\delta$ is the declination of the source under observation. Averaging of such
a fringe-frequency waveform for a period $\tau_a$ is equivalent to convolution with a
rectangular function of width $\tau_a$. The amplitude is thus decreased by a factor that
follows from the Fourier transform of the convolving function. This factor is

$$f_1 = \frac{\sin(\pi \nu_f \tau_a)}{\pi \nu_f \tau_a} . \tag{16.6}$$

In order to estimate a harmful threshold for interference ($F_h$), we compute the ratio
of the rms level of interference to the rms level of noise in a radio image and, as
before, equate the result to 0.1. The first step is to determine the mean squared value
of the modulus of the interference component in the visibility data. Figure 6.7b,
which depicts the spectral components at the correlator output, shows that the output
from the correlated signal component, in this case the interference, is represented
by a delta function. Assuming, as before, that the interference enters sidelobes of
gain 0 dBi and that the polarization is matched, we substitute in the magnitude of
the delta function $k T_A \Delta\nu = F_h c^2/4\pi\nu^2$. Thus, the sum of the squared modulus of
the interference over $n_r$ grid points in the (u, v) plane is


$$\sum_{n_r} \langle |r_i|^2 \rangle = \left( \frac{H_0^2 F_h c^2}{4\pi\nu^2} \right)^2 n_r \langle f_1^2 \rangle . \tag{16.7}$$

Here $r_i$ is the correlator response to the interference, $H_0$ is a voltage gain factor, and
$\langle f_1^2 \rangle$ is the mean squared value of $f_1$, as given in Eq. (16.6), which represents the
effect of the visibility averaging on the fringe-frequency oscillations. To determine
the mean squared value of $f_1$, a simple approach is to consider the variation of this
factor in the $(u', v')$ plane in which the antenna-spacing vector rotates with constant
angular velocity $\omega_e$ and sweeps out a circular locus, as described in Sect. 4.2. Also,
suppose that to interpolate the values of visibility at the rectangular grid points
in the (u, v) plane, the measured values are averaged with uniform weight within
rectangular cells centered on the grid points (see the description of cell averaging
in Sect. 5.2.2). Then the effective averaging time $\tau$ for the interference is equal to
the time taken by the baseline vector to cross a cell, as shown in Fig. 16.3. Note
from Eq. (16.5) that the fringe frequency goes through zero at the $v'$ axis, and $f_1$ is
then unity. For small values of $\psi$, as defined in Fig. 16.3, the path length through
a cell is closely equal to $\Delta u$, and the cell crossing time is $\tau = \Delta u/\omega_e q'$, where
$q' = \sqrt{X_\lambda^2 + Y_\lambda^2}$, and where $X_\lambda$ and $Y_\lambda$ are the components of antenna spacing
measured in wavelengths and projected onto the equatorial plane, as defined in
Sect. 4.1. Also, $\nu_f \tau = \Delta u \sin\psi \cos\delta$. Now $\Delta u$ is equal to the reciprocal of the
width of the synthesized field, which, except at long wavelengths, is unlikely to be
more than $\sim 0.5^\circ$. We therefore assume that $\Delta u$ is of order 100 or greater, which
permits the following simplification. For $\Delta u = 100$ and $|\delta| < 70^\circ$, $f_1^2$ goes from 1
to $10^{-3}$ as $\psi$ goes from 0 to $< 17^\circ$. Thus, most of the contribution to $\langle f_1^2 \rangle$ occurs for
small $\psi$, and we can substitute $\nu_f \tau = \psi \Delta u \cos\delta$ in Eq. (16.6) and obtain


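$$\langle f_1^2 \rangle = \frac{2}{\pi} \int_0^{\pi/2} \frac{\sin^2(\pi \psi \Delta u \cos\delta)}{(\pi \psi \Delta u \cos\delta)^2} \, d\psi \simeq \frac{1}{\pi \Delta u \cos\delta} . \tag{16.8}$$

Since $\Delta u$ is large, we have used an upper limit of $\infty$ in evaluating the integral.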
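
The averaging factor of Eq. (16.6) and the approximation in Eq. (16.8) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, the Earth rotation rate is taken as 7.2921e-5 rad/s, and the mean over the locus is evaluated with a simple midpoint rule using $\nu_f \tau = \Delta u \sin\psi \cos\delta$ without the small-angle substitution. Note that `numpy.sinc(x)` is defined as sin(πx)/(πx).

```python
import numpy as np

OMEGA_E = 7.2921e-5   # angular rotation velocity of the Earth, rad/s

def fringe_averaging_factor(u, dec, t_avg):
    """f_1 for spacing component u (wavelengths), declination dec (rad),
    and averaging time t_avg (s)."""
    nu_f = OMEGA_E * u * np.cos(dec)       # natural fringe frequency, Hz
    return np.sinc(nu_f * t_avg)           # sin(pi x) / (pi x)

def mean_square_f1(delta_u, dec, n=200_000):
    """Mean of f_1**2 over psi in [0, pi/2], midpoint rule."""
    psi = (np.arange(n) + 0.5) * (np.pi / 2.0) / n
    return np.mean(np.sinc(delta_u * np.sin(psi) * np.cos(dec)) ** 2)

dec = np.radians(30.0)
delta_u = 100.0
numeric = mean_square_f1(delta_u, dec)
approx = 1.0 / (np.pi * delta_u * np.cos(dec))
print(f"numerical mean of f_1^2: {numeric:.5e}")
print(f"1/(pi * delta_u * cos(dec)): {approx:.5e}")
```

For a cell size of 100 wavelengths the two values agree closely, which supports extending the upper limit of the integral to infinity.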


Fig. 16.3 Derivation of the mean cell crossing time for the spatial frequency locus indicated by
the broken line. The velocity of the spatial frequency vector in the $(u', v')$ plane is $\omega_e q'$. The mean
path length through the cell in the direction of the broken line is the cell area $\Delta u' \Delta v'$ divided by
the cell width projected normal to that direction.

For the noise, we again refer to Fig. 6.7b. The power spectral density of the
noise near zero frequency is $H_0^4 k^2 T_S^2 \Delta\nu$, and an equivalent bandwidth $\tau^{-1}$, including
negative frequencies, is passed by the averaging process; see Eq. (6.44). The mean-squared
component of the noise over the $n_r$ grid points is thus

$$\sum_{n_r} \langle |r_n|^2 \rangle = H_0^4 k^2 T_S^2 \Delta\nu \, n_r \langle \tau^{-1} \rangle , \tag{16.9}$$


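where $\langle \tau^{-1} \rangle$ is the mean value of $\tau^{-1}$. From Fig. 16.3, the mean cell crossing time is

$$\tau = \frac{\Delta u \, |\operatorname{cosec}\delta|}{\omega_e q' \left( |\sin\psi| + |\operatorname{cosec}\delta| \, |\cos\psi| \right)} . \tag{16.10}$$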


We have assumed that $\Delta u' = \Delta v' \sin\delta$ (i.e., $\Delta u = \Delta v$) and that for all except a
small number of cells, the path of the spatial frequency locus through a cell can be
approximated by a straight line. The mean value of $\tau^{-1}$ around a locus in the $(u', v')$
plane (see Sect. 4.2) is, from Eq. (16.10),

$$\frac{2}{\pi} \int_0^{\pi/2} \tau^{-1} \, d\psi = \frac{2 \omega_e q' \left( 1 + |\sin\delta| \right)}{\pi \Delta u} , \tag{16.11}$$

and the mean for the $n_r$ points in the (u, v) plane is

$$\langle \tau^{-1} \rangle = \frac{2 \omega_e \left( 1 + |\sin\delta| \right)}{\pi \Delta u \, n_r} \sum_{n_r} q' . \tag{16.12}$$


From Eqs. (16.7)–(16.9) and Eq. (16.12), the interference-to-noise ratio is

$$\frac{\langle |r_i| \rangle_{\rm rms}}{\langle |r_n| \rangle_{\rm rms}} = \frac{F_h c^2 \sqrt{n_r}}{4\pi k T_S \nu^2 \sqrt{2 \Delta\nu \, \omega_e} \, \sqrt{\cos\delta \left( 1 + |\sin\delta| \right)} \, \sqrt{\sum_{n_r} q'}} . \tag{16.13}$$

By Parseval's theorem, the ratio of the rms values of the interference and noise in
the image is equal to the same ratio in the visibility domain, which is given by
Eq. (16.13). To evaluate the harmful threshold $F_h$, we equate the right side to 0.1
and obtain

$$F_h = \frac{0.4\pi k T_S \nu^2 \sqrt{2 \Delta\nu \, \omega_e}}{c^2} \sqrt{\frac{1}{n_r} \sum_{n_r} q'} . \tag{16.14}$$

The factor $\sqrt{\cos\delta \left( 1 + |\sin\delta| \right)}$ has been replaced by unity, the resulting error being
less than 1 dB for $0 < |\delta| < 71^\circ$, and 2.3 dB for $\delta = 80^\circ$. The number of points
in the $(u', v')$ plane to which an antenna pair contributes is proportional to $q'$, so in
evaluating Eq. (16.14), it is convenient to write

$$\frac{1}{n_r} \sum_{n_r} q' = \frac{\sum_{n_p} q'^2}{\sum_{n_p} q'} , \tag{16.15}$$

where $n_p$ is the number of correlated antenna pairs in the array.

The interference threshold $S_h$, in units of W m$^{-2}$ Hz$^{-1}$, is given by

$$S_h = \frac{F_h}{\Delta\nu} = \frac{0.4\pi k T_S \nu^2 \sqrt{2 \omega_e}}{c^2 \sqrt{\Delta\nu}} \sqrt{\frac{1}{n_r} \sum_{n_r} q'} . \tag{16.16}$$

Note that $q'$ is proportional to $\nu$, so $S_h$ is proportional to $\nu^{2.5}$. Values of $S_h$
for the VLA are shown by the two middle curves in Fig. 16.2, which correspond to
configurations in which the maximum baselines are 35 and 1 km, respectively (see
Fig. 5.17b).
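
A sketch of how the array thresholds of Eqs. (16.14) and (16.16) might be evaluated is given below. It assumes that the grid-point mean of $q'$ is formed from the antenna pairs as the ratio of the sum of $q'^2$ to the sum of $q'$, and that the declination factor is set to unity. The function name, the list of pair spacings, and the receiver parameters are hypothetical values chosen for illustration.

```python
import math

K_B = 1.380649e-23    # Boltzmann's constant, J/K
C = 299792458.0       # speed of light, m/s
OMEGA_E = 7.2921e-5   # angular rotation velocity of the Earth, rad/s

def array_threshold(t_sys, freq, bandwidth, q_pairs):
    """Harmful thresholds (F_h, S_h) for a correlator array.

    q_pairs holds the equatorial-plane spacing q' of each correlated
    antenna pair, in wavelengths.
    """
    # Mean of q' over grid points, weighting each pair by its q'.
    mean_q = sum(q * q for q in q_pairs) / sum(q_pairs)
    f_h = (0.4 * math.pi * K_B * t_sys * freq ** 2
           * math.sqrt(2.0 * bandwidth * OMEGA_E) / C ** 2
           * math.sqrt(mean_q))
    return f_h, f_h / bandwidth

# Illustrative (assumed) inputs: four antenna pairs, spacings in wavelengths.
q_compact = [200.0, 500.0, 1200.0, 3000.0]
q_extended = [4.0 * q for q in q_compact]

f_c, s_c = array_threshold(22.0, 1413.5e6, 27e6, q_compact)
f_e, s_e = array_threshold(22.0, 1413.5e6, 27e6, q_extended)
print(f"compact:  S_h = {10 * math.log10(s_c):.1f} dBW m^-2 Hz^-1")
print(f"extended: S_h = {10 * math.log10(s_e):.1f} dBW m^-2 Hz^-1")
```

Because the threshold varies as the square root of the mean spacing, scaling every spacing by a factor of 4 raises the threshold by a factor of 2 (3 dB).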

Since the averaging is ineffective in reducing the interference when u goes 
through zero, visibility values containing the greatest contributions from interfer- 
ence cluster around the v axis. Some degree of randomness in the occurrence of 
high values is to be expected, as a result of the varying sidelobe levels through 
which the interference enters. Because of the (u, v) distribution, the interference 
in the image domain takes the form of quasi-random structure that is elongated
in the east-west direction; for an example, see Thompson (1982). The clustering
also suggests the possibility of reducing the interference response by deleting any 
questionable visibility data near the v axis. The resulting degradation of the (u, v) 
coverage would increase the sidelobes of the synthesized beam. 

The discussion above applies to cases in which the observation is of sufficiently 
long duration that the (u, v) plane is well sampled, and in which the strength of the 
interfering signal remains approximately constant during this time. If only a fraction 
$\alpha$ of the (u, v) loci cross the v axis, then a factor of $\sqrt{\alpha}$ should be introduced into the
denominators of Eqs. (16.14) and (16.16). Strong, sporadic interference can produce 
different responses from that considered above. 


16.3.3 Decorrelation of Broadband Signals 


Since interfering signals are usually incident from directions other than that of the 
desired radiation, their time delays to the correlator inputs are generally not equal. 
Broadband interfering signals are thereby decorrelated to some extent, which further 
reduces their response. The reduction is not amenable to a general-case analysis like 
that resulting from averaging of the fringe frequency, but it can be computed for 
each particular antenna configuration and position of the interfering source. For this 
reason, and the fact that only broadband signals are reduced, the effect has not been 
included in the threshold equations (16.14) and (16.16). 

At any instant during an observation, let $\theta_s$ be the angle between a plane normal to
the baseline for a pair of antennas and the direction of the source under observation.
$\theta_s$ defines a circle on the celestial sphere for which the delays are equalized.
Similarly, let $\theta_i$ be the corresponding angle for the source of interference. The delay
difference for the interfering signals at the correlator is

$$\tau_d = \frac{D \, |\sin\theta_s - \sin\theta_i|}{c} , \tag{16.17}$$

where D is the baseline length. Expressions for $\theta_s$ and $\theta_i$ can be derived from
Eq. (4.3), since $\sin\theta_s = w\lambda/D$, where w is the third spacing coordinate as shown
in Fig. 3.2, and $\lambda$ is the wavelength. Suppose that the received interfering signal has
an effectively rectangular spectrum of width $\Delta\nu$ and center frequency $\nu_0$, defined
either by the signal itself or by the receiving passband. By the Wiener–Khinchin
relation, the autocorrelation function of the signal is equal to

$$\frac{\sin(\pi \Delta\nu \, \tau_d)}{\pi \Delta\nu \, \tau_d} \cos(2\pi \nu_0 \tau_d) . \tag{16.18}$$

Expression (16.18) represents the real output of a complex correlator as a function of
the differential delay $\tau_d$. The imaginary output is represented by a similar expression
in which the cosine function is replaced by a sine. Thus, the decorrelation of the
modulus of the complex output for a delay $\tau_d$ is given by the factor

$$f_2 = \frac{\sin(\pi \Delta\nu \, \tau_d)}{\pi \Delta\nu \, \tau_d} . \tag{16.19}$$


For a fixed transmitter location, $\theta_i$ remains constant, but $\theta_s$ varies as the antennas
track. Thus, $\tau_d$ may go through zero, causing $f_2$ to peak, but unlike $f_1$ in Eq. (16.6),
a peak in $f_2$ can occur at any point on the (u, v) plane. Those antenna pairs for
which the $f_1$ and $f_2$ peaks overlap contribute most strongly to the interference in the
image, and those for which the peaks are well separated contribute less. Therefore, 
for broadband signals, the fringe-frequency and decorrelation effects should be 
considered in combination. For example, in calculations for the response of the VLA 
to a geostationary satellite on the meridian, a factor 


$$\sqrt{\frac{\sum q' f_1^2 f_2^2}{\sum q' f_1^2}} \tag{16.20}$$

was computed that represents the additional decrease in the rms interference
resulting from decorrelation (Thompson 1982). The summations in (16.20) were 
taken over all antenna pairs for equal increments in hour angle, and the q’ factors 
were inserted to compensate for the uneven density of the sampled points in the 
(u, v) plane. The antenna spacings of the VLA for both the most compact and most 
extended configurations were considered, with observing frequencies from 1.4 to 
23 GHz and bandwidths of 25 and 50 MHz. The results indicate that suppression 
of broadband interference by decorrelation varies from 4 to 34 dB, with strong 
dependence on the observing declination. The interference was assumed to extend 
uniformly across the bandwidth, which would tend to overestimate the suppression 
in a practical situation. 
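
The delay difference of Eq. (16.17) and the decorrelation factor of Eq. (16.19) are simple to evaluate for a single baseline. The following sketch does so under stated assumptions: the function names are hypothetical, the baseline length, angles, and bandwidth are illustrative values, and the modulus of the factor is returned so that it can be read directly as an amplitude reduction.

```python
import numpy as np

C = 299792458.0   # speed of light, m/s

def delay_difference(baseline_m, theta_s, theta_i):
    """Differential delay (s) of the interfering signal at the correlator."""
    return baseline_m * abs(np.sin(theta_s) - np.sin(theta_i)) / C

def decorrelation_factor(baseline_m, theta_s, theta_i, bandwidth):
    """Modulus of f_2 for a rectangular spectrum of width `bandwidth` (Hz)."""
    tau_d = delay_difference(baseline_m, theta_s, theta_i)
    return abs(np.sinc(bandwidth * tau_d))   # np.sinc(x) = sin(pi x)/(pi x)

# Illustrative (assumed) case: 1-km baseline, 50-MHz bandwidth.
theta_s = np.radians(10.0)
theta_i = np.radians(40.0)
f2 = decorrelation_factor(1000.0, theta_s, theta_i, 50e6)
f2_aligned = decorrelation_factor(1000.0, theta_s, theta_s, 50e6)
print(f"f_2 = {f2:.2e} when the delays differ, {f2_aligned:.1f} when they match")
```

The factor returns to unity whenever the two angles coincide, which is the condition under which the decorrelation gives no protection.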


16.4 Very-Long-Baseline Systems 


In VLBI arrays, in which the antenna spacings are hundreds or thousands of 
kilometers, the output resulting from correlated components of an interfering signal 
at the correlator inputs is usually negligible. This is because the natural fringe 
frequencies are higher than those in arrays with baselines up to a few tens of 
kilometers, and the delay inequalities for signals that do not come from the direction 
of observation are also much greater. Furthermore, unless the interfering signal 
originates in a satellite or spacecraft, it is unlikely to be present at two widely 
separated locations. 

Fig. 16.4 Components of the correlator input signals used in the discussion of the effects of
interference on VLBI observations.

Consider an interfering signal entering one antenna of a correlated pair. The
interference reduces the measured correlation, and the overall effect is similar to
an increase in the system noise for the antenna. In Fig. 16.4, x(t) and y(t) represent
the signals plus system noise from two antennas in the absence of interference, and
z(t) represents an interfering signal at one antenna. The three waveforms have zero
means, and the standard deviations are $\sigma$ for x and y and $\sigma_i$ for z. In the absence of
interference, the measured correlation coefficient is

$$\rho_1 = \frac{\langle xy \rangle}{\sqrt{\langle x^2 \rangle \langle y^2 \rangle}} . \tag{16.21}$$


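When the interference is present, the correlation becomes

$$\rho_2 = \frac{\langle xy \rangle + \langle xz \rangle}{\sqrt{\langle x^2 \rangle \left[ \langle y^2 \rangle + 2\langle yz \rangle + \langle z^2 \rangle \right]}} . \tag{16.22}$$

The interference is uncorrelated with x and y, so $\langle xz \rangle = \langle yz \rangle = 0$. Also, at the
harmful threshold, $\sigma_i^2 \ll \sigma^2$. Thus, from Eqs. (16.21) and (16.22),

$$\rho_2 \simeq \rho_1 \left[ 1 - \frac{1}{2} \left( \frac{\sigma_i}{\sigma} \right)^2 \right] . \tag{16.23}$$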
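
The reduction in correlation given by Eq. (16.23) can be illustrated with simulated noise. This is a minimal sketch under stated assumptions: x and y are modeled as unit-variance Gaussian signals sharing a common component, z is independent Gaussian noise added to y, and the function name and the values of the correlation and interference level are hypothetical.

```python
import numpy as np

def correlation_with_interference(rho, sigma_i, n=1_000_000, seed=2):
    """Return (rho_1, rho_2): measured correlation without and with an
    uncorrelated interfering signal of rms sigma_i added to one input."""
    rng = np.random.default_rng(seed)
    common, n1, n2, zi = rng.standard_normal((4, n))
    x = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * n1   # unit variance
    y = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * n2   # unit variance
    z = sigma_i * zi                                       # interference

    def coeff(a, b):
        return np.mean(a * b) / np.sqrt(np.mean(a * a) * np.mean(b * b))

    return coeff(x, y), coeff(x, y + z)

sigma_i = 0.1    # interference rms relative to sigma = 1
rho_1, rho_2 = correlation_with_interference(rho=0.5, sigma_i=sigma_i)
predicted_ratio = 1.0 - 0.5 * sigma_i ** 2
print(f"measured rho_2/rho_1 = {rho_2 / rho_1:.4f}, "
      f"approximation = {predicted_ratio:.4f}")
```

An interference power of 1% of the system noise power lowers the measured correlation by about 0.5%, and the change scales with the correlation itself, so it acts as a gain error.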


The interference reduces the measured correlation. In a system with automatic level 
control (ALC), the reduction in correlation can be envisaged as resulting from a 
reduction in the system gain in response to the added power of the interference. 
The error introduced in the correlation measurement therefore takes the form of a 
multiplicative factor, rather than an additive error component. Interference causes 
additive errors in single antennas or arrays that have short enough baselines that 
the detector or correlator responds directly to the interfering signal. The different
effects of these two types of error have been discussed in Sect. 10.6.3. In principle,
the change in the effective gain can be monitored by using a calibration signal, 
as discussed in Sect. 7.6. However, such a calibration process could be difficult if 
the strength of the interference varies rapidly. The harmful interference threshold 
should therefore be specified so it is just small enough that the errors introduced do 
not significantly increase the level of uncertainty in the measurements. In general, a 
value of 1% for variations in the visibility amplitude resulting from interference is 
a reasonable choice. If we include the possibility of simultaneous but uncorrelated 
interference in both antennas, the resulting condition is 


$$\left( \frac{\sigma_i}{\sigma} \right)^2 \leq 0.01 . \tag{16.24}$$


It follows from Parseval's theorem that a 1% rms error in the visibility introduces
into the intensity an error of which the rms over the image is 1% of the
corresponding rms of the true intensity distribution. The effect on the dynamic range
of intensity within the image depends on the form of the intensity distribution and
of the error distribution. For an image of a single point source, the rms intensity
error would be about $10^{-2} \sqrt{f/n_r}$ times the peak intensity, where f is the fraction of
the $n_r$ gridded visibility data that contain interference. Here it is assumed that the
fluctuations in the received interfering signal are sufficiently fast that the values of 
the interference level are essentially independent for each gridded visibility point. If 
this is not the case, the resulting error will be greater. 

To comply with the criterion in Eq. (16.24), the ratio of the powers of the
interference to system noise as given by Eq. (16.2) must not exceed 0.01. Thus,
for the harmful threshold, we have 


$$F_h = \frac{0.04\pi k T_S \nu^2 \Delta\nu}{c^2} . \tag{16.25}$$

The interference threshold in units of W m$^{-2}$ Hz$^{-1}$ is

$$S_h = \frac{F_h}{\Delta\nu} = \frac{0.04\pi k T_S \nu^2}{c^2} . \tag{16.26}$$
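
As a numerical illustration of Eqs. (16.25) and (16.26), the sketch below evaluates the VLBI thresholds and compares the spectral threshold with the total-power value of Eq. (16.4). The function names are hypothetical, the same system temperature is assumed for both cases, and the parameter values (22 K, 1413.5 MHz, 27 MHz, 2000 s) are illustrative rather than taken from this chapter.

```python
import math

K_B = 1.380649e-23    # Boltzmann's constant, J/K
C = 299792458.0       # speed of light, m/s

def vlbi_threshold(t_sys, freq, bandwidth, ratio=0.01):
    """(F_h, S_h) for VLBI: interference-to-noise power ratio `ratio`
    at the correlator input, 0 dBi sidelobe gain."""
    f_h = 4.0 * math.pi * ratio * K_B * t_sys * freq ** 2 * bandwidth / C ** 2
    return f_h, f_h / bandwidth

def total_power_spectral_threshold(t_sys, freq, bandwidth, t_avg, ratio=0.1):
    """S_h for a total-power system: ratio applies to the averaged output."""
    return (4.0 * math.pi * ratio * K_B * t_sys * freq ** 2
            / (C ** 2 * math.sqrt(t_avg * bandwidth)))

# Illustrative (assumed) parameters.
t_sys, freq, bandwidth, t_avg = 22.0, 1413.5e6, 27e6, 2000.0
f_vlbi, s_vlbi = vlbi_threshold(t_sys, freq, bandwidth)
s_tp = total_power_spectral_threshold(t_sys, freq, bandwidth, t_avg)
margin_db = 10.0 * math.log10(s_vlbi / s_tp)
print(f"VLBI S_h = {10 * math.log10(s_vlbi):.1f} dBW m^-2 Hz^-1")
print(f"VLBI threshold is {margin_db:.1f} dB above the total-power value")
```

For equal system temperatures the difference depends only on the product of bandwidth and averaging time used for the total-power case; here it is close to 44 dB.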


Note that the interference-to-noise ratio of 0.01 here refers to the levels at the 
correlator input. In the case of total-power systems (single antennas) and the 
arrays considered in Sect. 16.3, for which the errors are additive, the criterion 
of an interference-to-noise ratio of 0.1 applies to the time-averaged output of the 
correlator or detector. This therefore results in lower (i.e., more stringent) thresholds 
than those for VLBI in Eqs. (16.25) and (16.26). A curve for VLBI is shown in 
Fig. 16.2, using typical values for $T_S$. The harmful thresholds are approximately
40–50 dB less stringent than those for total-power systems.


16.5 Interference from Airborne and Space Transmitters


In application of the $F_h$ and $S_h$ values obtained above, it was assumed that the
angular distance between the pointing direction of the antenna beams and the 
direction of the source of interference is large enough that the interference enters 
through sidelobes of gain ~ 1 dB or less, i.e., that the angular distance is ~ 19° 
or more. Thus, airborne and satellite transmitters present a special problem. Radio 
astronomy cannot share bands with space-to-Earth (downlink) transmissions of 
satellites. However, because of the pressure for more spectrum for communications, 
allocations have been made in bands adjacent or close to those allocated to radio 
astronomy. Spurious emissions from satellite transmitters that fall outside the 
allocated band of the satellite pose a very serious threat to radio astronomy. Motion 
of the transmitter across the sky is most likely to increase the fringe frequency 
at the correlator outputs of a synthesis array and thereby reduce the response 
to interference. On the other hand, these signals may be received in high-level 
sidelobes near the main beam. 

Examples of spurious emissions that extend far outside the allocated band of the 
satellite system are described by Galt (1990) and Combrinck et al. (1994). In these 
cases, the spurious emission resulted largely from the use of simple phase shift 
keying for the modulation, and newer techniques (e.g., Gaussian-filtered minimum 
shift keying) provide much sharper reduction in spectral sidebands (Murota and 
Hirade 1981; Otter 1994). However, intermodulation products resulting from the 
nonlinearity of amplifiers carrying many communication channels can remain a 
problem. 

In some cases, operating requirements and limitations associated with space 
tend to make reduction of spurious emissions difficult. Some satellites use a 
large number of narrow beams to cover their area of operation, so that the same 
frequency channels can be used a corresponding number of times to accommodate 
a large number of customers. This requires phased-array antennas with many (of 
order 100 or more) small radiating elements, each with its own power amplifier 
[see, e.g., Schuss et al. (1999)]. Because of power limitations from the solar 
cells, these amplifiers are operated at levels that maximize power efficiency but 
could compromise linearity, resulting in spurious emissions from intermodulation 
products. 

The recommended limits on spurious emissions (ITU-R 2012) in effect require 
that, for space services, the power in spurious emissions measured in a 4-kHz band 
at the transmitter output should be no more than −43 dBW. Thus, for example, 
spurious emission at this level from a low-Earth-orbit satellite at 800 km height 
and radiated from a sidelobe of 0 dBi gain would produce a spurious spectral power 
flux density of −208 dBW m$^{-2}$ Hz$^{-1}$ at the Earth's surface. This figure may be 
compared with the harmful interference thresholds for radio astronomy of −239 and 
−255 dBW m$^{-2}$ Hz$^{-1}$ for spectral line and continuum measurements, respectively, 
at 1.4 GHz. Although this very simple calculation considers only the worst-case 
situation, the differences of several tens of decibels show that such limits do not 
effectively protect radio astronomy. 
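
The arithmetic behind the figure of −208 dBW m$^{-2}$ Hz$^{-1}$ can be retraced with a short sketch. It assumes free-space spreading over a sphere of radius equal to the satellite height (satellite directly overhead) and uses the numbers quoted in the paragraph above; the function name is hypothetical.

```python
import math


def spfd_dbw_m2_hz(power_dbw: float, ref_bw_hz: float,
                   gain_dbi: float, distance_m: float) -> float:
    """Spectral power flux density (dBW m^-2 Hz^-1) at a given distance."""
    psd_dbw_hz = power_dbw - 10.0 * math.log10(ref_bw_hz)        # power per hertz
    spreading_db = 10.0 * math.log10(4.0 * math.pi * distance_m ** 2)
    return psd_dbw_hz + gain_dbi - spreading_db


# -43 dBW in a 4-kHz band, 0 dBi sidelobe, satellite at 800 km height.
spfd = spfd_dbw_m2_hz(power_dbw=-43.0, ref_bw_hz=4e3,
                      gain_dbi=0.0, distance_m=800e3)

# Excess over the 1.4-GHz harmful thresholds quoted in the text.
excess_spectral_line_db = spfd - (-239.0)
excess_continuum_db = spfd - (-255.0)
print(f"spfd = {spfd:.1f} dBW m^-2 Hz^-1")
print(f"excess: {excess_spectral_line_db:.1f} dB (line), "
      f"{excess_continuum_db:.1f} dB (continuum)")
```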
Regulation of the usage of the radio spectrum is organized through ITU, based in 
Geneva, which is a specialized agency of the United Nations. Radio astronomy was 
first officially recognized as a radiocommunication service by the ITU in 1959. The 
ITU-R was created in March 1993 and replaced the International Radio Consultative 
Committee (CCIR), an earlier entity within the ITU. A system of study groups 
within the ITU-R is responsible for technical matters. Study Group 7, entitled 
Science Services, includes radio astronomy, various aspects of space research, 
environmental monitoring, and standards for time and frequency. Study groups 
are subdivided into working parties that deal with specific areas. Their primary 
function is to study problems of current importance in frequency coordination, for 
example, specific cases of sharing of frequency bands between different services, 
and to produce documented Recommendations on the solutions. Decisions within 
the ITU are made largely by consensus. Recommendations must be approved by 
all of the radiocommunication study groups and then effectively become part of the 
ITU Radio Regulations. Recommendations in the RA series are specific to radio 
astronomy. 

The ITU-R organizes meetings of study groups, working parties, and other 
groups required from time to time to deal with specific problems. It also organizes 
World Radiocommunication Conferences (WRCs) at intervals of two to three years, 
at which new spectrum allocations are made and the ITU Radio Regulations 
are revised as necessary. Administrations of many countries send delegations to 
WRCs, and the results of these conferences have the status of treaties. Participating 
countries can take exceptions to the international regulations so long as these do 
not affect spectrum usage in other countries. As a result, many administrations have 
their own system of radio regulations, based largely on the ITU Radio Regulations 
but with exceptions to accommodate their particular requirements: see Pankonin 
and Price (1981) and Thompson et al. (1991). See also ITU-R Handbook on Radio 
Astronomy (2013) and ITU-R Recommendation RA.769-2 (2003). 
Chapter 17 
Related Techniques 


Concepts and techniques similar to those used in radio interferometry and synthesis 
imaging occur in various areas of astronomy, Earth remote sensing, and space 
science. Here we introduce a few of them, including optical techniques, to leave the 
reader with a broader view. All of these subjects are described in detail elsewhere, 
so here the aim is mainly to outline the principles involved and to make connections 
between them and the material developed in earlier chapters. 


17.1 Intensity Interferometer 


In long-baseline interferometry, the intensity interferometer offers some technical 
simplifications that were mainly of importance in radio astronomy during the early 
development of the subject. As mentioned in Sect. 1.3.7, its practical applications in 
radio astronomy have been limited (Jennison and Das Gupta 1956; Carr et al. 1970; 
Dulk 1970) because, in comparison with a conventional interferometer, an intensity 
interferometer requires a much higher signal-to-noise ratio (SNR) in the receiving 
system, and only the modulus of the visibility function is measured. This type of 
interferometer was devised by Hanbury Brown, who has described its development 
and application (Hanbury Brown 1974). 

In the intensity interferometer, the signals from the antennas are amplified and 
then passed through square-law (power-linear) detectors before being applied to a 
correlator, as shown in Fig. 17.1. As a result, the rms signal voltages at the correlator 
inputs are proportional to the powers delivered by the antennas and therefore also 
proportional to the intensity of the signal. No fringes are formed because the phase 
of the radio frequency (RF) signals is lost in the detection, but the correlator output 
indicates the degree of correlation of the detected waveforms. Let the voltages at 
the detector inputs be $V_1$ and $V_2$. The outputs of the detectors are $V_1^2$ and $V_2^2$, and 
each consists of a dc component, which is removed by a filter, and a time-varying 
component, which goes to an input of the correlator. From the fourth-order moment 
relation [Eq. (6.36)], the correlator output is 

$$\left\langle \left(V_1^2 - \langle V_1^2\rangle\right)\left(V_2^2 - \langle V_2^2\rangle\right) \right\rangle = \langle V_1^2 V_2^2\rangle - \langle V_1^2\rangle\langle V_2^2\rangle = 2\langle V_1 V_2\rangle^2 . \qquad (17.1)$$

The correlator output is proportional to the square of the correlator output for a 
conventional interferometer and measures the squared modulus of the visibility of a 
source under observation. 

Fig. 17.1 The intensity interferometer. The amplifier and filter block may also incorporate a local 
oscillator and mixer. The compensating delay equalizes the time delays of signals from the source 
to the correlator inputs. The post-detector filters remove dc and radio frequency components. 
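
The fourth-order moment relation used in Eq. (17.1) can be checked with a small Monte Carlo experiment. The sketch below assumes zero-mean, jointly Gaussian voltages of unit variance with a chosen correlation coefficient `rho`; the sample count and the value of `rho` are arbitrary illustrative choices.

```python
import math
import random


def simulate_moments(rho: float, n_samples: int = 200_000, seed: int = 1):
    """Return (lhs, rhs) of Eq. (17.1) estimated from correlated Gaussian noise."""
    rng = random.Random(seed)
    sum_v1v2 = sum_p1 = sum_p2 = sum_p1p2 = 0.0
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        v1 = a
        v2 = rho * a + math.sqrt(1.0 - rho * rho) * b   # <v1 v2> = rho
        p1, p2 = v1 * v1, v2 * v2                        # square-law detector outputs
        sum_v1v2 += v1 * v2
        sum_p1 += p1
        sum_p2 += p2
        sum_p1p2 += p1 * p2
    n = float(n_samples)
    # Correlation of the detector outputs after removal of the dc terms.
    lhs = sum_p1p2 / n - (sum_p1 / n) * (sum_p2 / n)
    # Twice the square of the conventional correlator output.
    rhs = 2.0 * (sum_v1v2 / n) ** 2
    return lhs, rhs


lhs, rhs = simulate_moments(rho=0.6)
print(f"<V1^2 V2^2> - <V1^2><V2^2> = {lhs:.3f},  2<V1 V2>^2 = {rhs:.3f}")
```

Both estimates should approach $2\rho^2$ as the number of samples grows, which is the statement that the intensity interferometer responds to the square of the conventional correlator output.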

We now give an alternative derivation of the response, which provides a physical 
picture of how the signals from different parts of the source combine within the 
instrument. The source is represented as a one-dimensional intensity distribution 
in Fig. 17.2. We suppose that it can be considered as a linear distribution of many 
small regions, each of which is large enough to emit a signal with the characteristics 
of stationary random noise, but of angular width small compared with 1/u, which 
defines the angular resolution of the interferometer. The source is assumed to be 
spatially incoherent so the signals from different regions are uncorrelated. Consider 
two regions of the source, $k$ and $\ell$, at angular positions $\theta_k$ and $\theta_\ell$ and subtending 
angles $d\theta_k$ and $d\theta_\ell$, as in Fig. 17.2. Each radiates a broad spectrum, but we first 
consider only the output resulting from a Fourier component at frequency $\nu_k$ 
from region $k$ and similarly a component at $\nu_\ell$ from region $\ell$. Let $A_1(\theta)$ be the 
power reception pattern of the two antennas and $I_1(\theta)$ the intensity distribution of 
the source, these two functions being one-dimensional representations. Then the 
detector output of the first receiver is equal to 

$$\left[V_k \cos 2\pi\nu_k t + V_\ell \cos(2\pi\nu_\ell t + \phi_1)\right]^2 , \qquad (17.2)$$

where $\phi_1$ is a phase term resulting from path-length differences, and the signal 
voltages $V_k$ and $V_\ell$ are given by 

$$V_k^2 = A_1(\theta_k)\, I_1(\theta_k)\, d\theta_k\, d\nu_k \qquad (17.3)$$

and 

$$V_\ell^2 = A_1(\theta_\ell)\, I_1(\theta_\ell)\, d\theta_\ell\, d\nu_\ell . \qquad (17.4)$$

Fig. 17.2 Distances and angles used in the discussion of the intensity interferometer. $u$ is the 
projected antenna spacing in wavelengths. 
After expanding (17.2) and removing the dc and RF terms, we obtain for the detector 
output from receiver 1: 

$$V_k V_\ell \cos\left[2\pi(\nu_k - \nu_\ell)t - \phi_1\right] . \qquad (17.5)$$

Similarly, the detector output from receiver 2 is 

$$V_k V_\ell \cos\left[2\pi(\nu_k - \nu_\ell)t - \phi_2\right] . \qquad (17.6)$$

The correlator output is proportional to the time-averaged product of (17.5) 
and (17.6), that is, to 

$$\left\langle A_1(\theta_k) A_1(\theta_\ell) I_1(\theta_k) I_1(\theta_\ell)\, d\theta_k\, d\theta_\ell\, d\nu_k\, d\nu_\ell \cos(\phi_1 - \phi_2) \right\rangle . \qquad (17.7)$$


The change in the phase term with respect to frequency is small so long as the 
fractional bandwidth is much less than the ratio of the resolution to the field of view 
[see Eq. (6.69) and related discussion]. With this restriction, expression (17.7) is 
effectively independent of the frequencies $\nu_k$ and $\nu_\ell$, so that if we integrate it with 
respect to $\nu_k$ and $\nu_\ell$ over a rectangular receiving passband of width $\Delta\nu$, $d\nu_k\, d\nu_\ell$ is 
replaced by $\Delta\nu^2$. 

The phase angles $\phi_1$ and $\phi_2$ result from the path differences $kk'$ and $\ell\ell'$ shown 
in Fig. 17.3. Note that $\phi_1$ and $\phi_2$ have opposite signs since the excess path length to 
antenna 1 is from point $\ell$ and that to antenna 2 is from point $k$. If $R_s$ is the distance of 
the source from the antennas, the distance $k\ell$ in the source is approximately equal 
to $R_s(\theta_k - \theta_\ell)$. The angle $\alpha_k + \alpha_\ell$ is approximately equal to $u\lambda/R_s$, since $u$ represents 
the antenna spacing projected normal to the source and measured in wavelengths. 
The preceding approximations are accurate if $\alpha_k$, $\alpha_\ell$, and the angle subtended by the 
source are all small. Thus, the difference of the phase angles is 

$$\phi_1 - \phi_2 = 2\pi R_s(\theta_k - \theta_\ell)\,\frac{(\sin\alpha_k + \sin\alpha_\ell)}{\lambda} \simeq 2\pi u(\theta_k - \theta_\ell) . \qquad (17.8)$$

Fig. 17.3 Relative delay paths $kk'$ and $\ell\ell'$ from regions $k$ and $\ell$ of the source for rays traveling in 
the directions of antennas 1 and 2. 

From (17.7), the output of the correlator now becomes 

$$\left\langle A_1(\theta_k) A_1(\theta_\ell) I_1(\theta_k) I_1(\theta_\ell)\, \Delta\nu^2 \cos\left[2\pi u(\theta_k - \theta_\ell)\right] d\theta_k\, d\theta_\ell \right\rangle . \qquad (17.9)$$


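To obtain the output from all pairs of regions within the source, expression (17.9) 
can, with the assumption of spatial incoherence, be integrated with respect to $\theta_k$ and 
$\theta_\ell$ over the source, giving 

$$\begin{aligned}
&\left\langle \left[\Delta\nu \int A_1(\theta_k) I_1(\theta_k) \cos(2\pi u\theta_k)\, d\theta_k\right] \left[\Delta\nu \int A_1(\theta_\ell) I_1(\theta_\ell) \cos(2\pi u\theta_\ell)\, d\theta_\ell\right] \right. \\
&\quad \left. + \left[\Delta\nu \int A_1(\theta_k) I_1(\theta_k) \sin(2\pi u\theta_k)\, d\theta_k\right] \left[\Delta\nu \int A_1(\theta_\ell) I_1(\theta_\ell) \sin(2\pi u\theta_\ell)\, d\theta_\ell\right] \right\rangle \\
&\quad = A_0^2\, \Delta\nu^2 \left[\mathcal{V}_R^2 + \mathcal{V}_I^2\right] = A_0^2\, \Delta\nu^2\, |\mathcal{V}|^2 , \qquad (17.10)
\end{aligned}$$

where we assume that the antenna response $A_1(\theta)$ has a constant value $A_0$ over 
the source, and the subscripts $R$ and $I$ denote the real and imaginary parts of the 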
visibility. This result follows from the definition of visibility that is given for a two- 
dimensional source in Sect. 3.1.1. Thus, the correlator output is proportional to the 
square of the modulus of the complex visibility. For a more detailed discussion 
following the same approach, see Hanbury Brown and Twiss (1954). An analysis 
based on the mutual coherence of the radiation field is given by Bracewell (1958). 
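
The step from the pairwise sum over source regions to the squared modulus of the visibility can be illustrated numerically. The sketch below assumes a hypothetical one-dimensional source made of two Gaussian components, sets $A_0 = 1$ and $\Delta\nu = 1$, and compares the double sum of $I_1(\theta_k) I_1(\theta_\ell)\cos[2\pi u(\theta_k - \theta_\ell)]$ with $|\mathcal{V}(u)|^2$ computed directly.

```python
import cmath
import math

ARCSEC = math.pi / (180.0 * 3600.0)   # radians per arcsecond

# Hypothetical source: two Gaussian components, sampled every 0.1 arcsec.
thetas = [(-10.0 + 0.1 * i) * ARCSEC for i in range(201)]
intensity = [math.exp(-0.5 * ((t / ARCSEC - 2.0) / 1.0) ** 2)
             + 0.5 * math.exp(-0.5 * ((t / ARCSEC + 3.0) / 1.0) ** 2)
             for t in thetas]
d_theta = thetas[1] - thetas[0]
u = 3.0e4                              # projected spacing in wavelengths (assumed)


def visibility(u_wavelengths: float) -> complex:
    """One-dimensional complex visibility of the sampled source."""
    return sum(i * cmath.exp(-2j * math.pi * u_wavelengths * t)
               for i, t in zip(intensity, thetas)) * d_theta


def pairwise_output(u_wavelengths: float) -> float:
    """Sum over all pairs of regions (k, l) of the cosine term, with A_0 = 1, dnu = 1."""
    total = 0.0
    for i_k, t_k in zip(intensity, thetas):
        for i_l, t_l in zip(intensity, thetas):
            total += i_k * i_l * math.cos(2.0 * math.pi * u_wavelengths * (t_k - t_l))
    return total * d_theta ** 2


pair_sum = pairwise_output(u)
vis_sq = abs(visibility(u)) ** 2
print(pair_sum, vis_sq)
```

The two numbers agree to rounding error because expanding $\cos[2\pi u(\theta_k - \theta_\ell)]$ splits the double sum into the product of cosine sums plus the product of sine sums, which are the squares of the real and imaginary parts of the visibility.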

Some characteristics of the intensity interferometer offer advantages over the 
conventional interferometer. The intensity interferometer is much less sensitive to 
atmospheric phase fluctuations, because each signal component at the correlator 
input is generated as the difference between two radio frequency components that 
have followed almost the same path through the atmosphere. The phase fluctuations 
in the difference-frequency components at the detectors are less than those in 
the radio frequency signals by the ratio of the difference frequency to the radio 
frequency, which may be of order $10^{-5}$. In the conventional interferometer, such 
phase fluctuations can make the amplitude, as well as the phase, of the visibility 
difficult to measure. Similarly, fluctuations in the phases of the local oscillators 
in the two receivers do not contribute to the phases of the difference-frequency 
components. Thus, it is not necessary to synchronize the local oscillators or even to 
use high-stability frequency standards, as in VLBI. These advantages were helpful, 
although by no means essential, in the early radio implementation of the intensity 
interferometer. Had the diameters of the sources under investigation then been 
of order of arcseconds, rather than arcminutes, the characteristics of the intensity 
interferometer would have played a more essential role. 

The serious disadvantage of the intensity interferometer is its relative lack of 
sensitivity. Because of the action of the detectors in the receivers, the ratio of the 
signal power to the noise power at the correlator inputs is proportional to the square 
of the corresponding ratio in the RF (predetector) stages, the exact value being 
dependent on the bandwidths of these and the post-detector stages (Hanbury Brown 
and Twiss 1954). In a conventional interferometer, it is possible to detect signals that 
are ~60 dB below the noise at the correlator inputs. In the intensity interferometer, 
a similar SNR at the correlator output would require SNRs greater by ~30 dB 
in the RF stages. This effect, together with the lack of sensitivity to the visibility 
phase, has greatly restricted the radio usage of the intensity interferometer. Intensity 
interferometry played a similar role in the early days of optical interferometry (see 
Sect. 17.6.3) before the development of the modern Michelson interferometer. 


17.2 Lunar Occultation Observations 


Measurement of the light intensity from a star as a function of time during 
occultation by the Moon was suggested by MacMahon (1909) as a means of 
determining the star’s size and position. His analysis, which was based on a simple 
consideration of geometric optics, was criticized by Eddington (1909), who stated 
that diffraction effects would mask the detail at the angular scale of the star. 
Eddington’s paper probably discouraged observations of lunar occultations for some 
time. The first occultation measurements were reported 30 years later by Whitford 
(1939), who observed the stars β Capricorni and ν Aquarii and obtained clear 
diffraction patterns. 

What was not realized by Eddington and others at the time was that although 
the temporal response to an occultation is not a simple step function, as it would be 
for the case of geometrical optics and a point source, the Fourier transform of the 
point-source response, which represents the sensitivity to spatial frequency on the 
sky, has the same amplitude as that of a step function and differs only in the phase. 
Hence, the lunar occultation is sensitive to all Fourier components, and there is no 
intrinsic limit to the resolution that can be obtained, except for that imposed by the 
finite SNR. This equality of the amplitudes was recognized by Scheuer (1962), who 
devised a method of deriving the one-dimensional intensity distribution $I_1$ from the 
occultation curve. By that time, the concept of spatial frequency had become widely 
understood through application to radio interferometry. Since, in lunar occultations, 
the diffraction occurs outside the Earth’s atmosphere, the high angular resolution 
is not corrupted significantly by atmospheric effects, as it is in the case of ground- 
based interferometry. Furthermore, the only dependence of the obtainable resolution 
on the telescope size results from the SNR. An early radio application of the 
technique was the measurement of the position and size of 3C273 by Hazard et al. 
(1963), which led to the identification of quasars. As mentioned in Sect. 12.1, this 
position measurement was used for many years as the right ascension reference for 
VLBI position catalogs. Radio occultation measurements have been most important 
at meter wavelengths, since at shorter wavelengths, the high thermal flux density 
from the Moon presents a difficulty. At radio frequencies, lunar occultations have 
been largely superseded by interferometry, but lunar occultations are still useful at 
optical and infrared wavelengths. 


Fig. 17.4 Occultation of a radio source by the Moon: (a) the geometrical situation, in which $\theta$ is 
measured clockwise from the direction of the source, and is negative as shown; (b) the occultation 
curve (received power) for a point source, which is proportional to $\mathcal{P}(\theta)$. The units of $\theta$ on the 
abscissa are equal to $\sqrt{\lambda/2R_m}$, where $\lambda$ is the wavelength and $R_m$ the Moon's distance. 


Figure 17.4 shows the geometrical situation and the form of an occultation 
record. The departure of the Moon’s limb from a straight edge, as a result of 
curvature and roughness, is small compared with the size of the first Fresnel zone 
at radio frequencies. Thus, the point-source response is the well-known diffraction 
pattern of a straight edge, which is derived in most texts on physical optics. The 
main change in the received power in Fig. 17.4b corresponds to the covering or 
uncovering of the first Fresnel zone by the Moon, and the oscillations result from 
higher-order zones. The critical scale is the size of the first Fresnel zone, $\sqrt{\lambda R_m/2}$, 
where $R_m \simeq 3.84 \times 10^5$ km is the Earth-Moon distance. This corresponds to 4400 m 
at 10-cm wavelength and 10 m at 0.5 μm, or 2.3″ and 5 mas, respectively, in angle 
as seen from the Earth. The maximum velocity of the occulting edge of the Moon 
is approximately 1 km s$^{-1}$, but the effective velocity depends on the position of the 
occultation on the Moon's limb, and we use 0.6 km s$^{-1}$ as a typical figure. Thus, the 
coverage time of the first Fresnel zone, which determines the characteristic fall time 
and oscillation period, is typically about 7 s at a wavelength of 10 cm and 16 ms at 
0.5 μm. 
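
The Fresnel-zone figures quoted above follow from a few lines of arithmetic. The sketch below uses the Earth-Moon distance and the typical limb velocity of 0.6 km s$^{-1}$ given in the text; the function name is hypothetical.

```python
import math

R_MOON_M = 3.84e8                          # Earth-Moon distance, m
V_LIMB_M_S = 600.0                         # typical effective limb velocity, m/s
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def fresnel_scale(wavelength_m: float):
    """Return (linear size in m, angular size in arcsec, crossing time in s)."""
    size_m = math.sqrt(wavelength_m * R_MOON_M / 2.0)
    angle_arcsec = size_m / R_MOON_M * RAD_TO_ARCSEC
    crossing_s = size_m / V_LIMB_M_S
    return size_m, angle_arcsec, crossing_s


radio = fresnel_scale(0.10)       # 10-cm wavelength
optical = fresnel_scale(0.5e-6)   # 0.5-micrometer wavelength
for label, (size, angle, time) in (("10 cm", radio), ("0.5 um", optical)):
    print(f"{label}: {size:.1f} m, {angle:.4f} arcsec, {time:.4f} s")
```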

In the case of the hypothetical geometrical-optics occultation, the observed curve 
would be the integral of $I_1$ as a function of $\theta$, the angle between the source and the 
Moon's limb as in Fig. 17.4a. Then $I_1$ could be obtained by differentiation. In the 
actual case, the observed occultation curve $\mathcal{G}(\theta)$ is equal to the convolution of $I_1(\theta)$ 
with the point-source diffraction pattern of the Moon's limb, $\mathcal{P}(\theta)$. This convolution 
is $I_1(\theta) * \mathcal{P}(\theta)$. Differentiation with respect to $\theta$ yields 

$$\mathcal{G}'(\theta) = I_1(\theta) * \mathcal{P}'(\theta) , \qquad (17.11)$$

where the primes indicate derivatives. Fourier transformation of the two sides of 
Eq. (17.11) gives 

$$\overline{\mathcal{G}'}(u) = \overline{I}_1(u)\, \overline{\mathcal{P}'}(u) , \qquad (17.12)$$

where the bar indicates the Fourier transform, the prime indicates a derivative in the 
$\theta$ domain, and $u$ is the conjugate variable of $\theta$. 

Now in the geometrical-optics case, $\mathcal{P}(\theta)$ would be a step function, and thus 
$\mathcal{P}'(\theta)$ would be a delta function for which the Fourier transform is a constant. For 
the diffraction-limited case, the function $\overline{\mathcal{P}}(u)$ [adapted from Cohen (1969)] is given 
by 

$$\overline{\mathcal{P}}(u) = \frac{j}{u} \exp\left[j2\pi\theta_F^2 u^2\, \mathrm{sgn}\, u\right] , \qquad (17.13)$$

where $\theta_F$ is the angular size of the first Fresnel zone, $\sqrt{\lambda/2R_m}$, and sgn is the 
sign function, which takes values ±1 to indicate the sign of $u$. It follows from 
the derivative theorem of Fourier transforms that $\overline{\mathcal{P}'}(u) = j2\pi u\, \overline{\mathcal{P}}(u)$, which has a 
constant amplitude with no zeros and can be divided out from Eq. (17.12). Thus, 
$I_1(\theta)$ is equal to $\mathcal{G}'(\theta)$ convolved with a function whose Fourier transform is 
$1/\overline{\mathcal{P}'}(u)$. Scheuer (1962) shows that this last function is proportional to $\mathcal{P}'(-\theta)$, 
which can be used as a restoring function as follows: 

$$I_1(\theta) = \mathcal{G}'(\theta) * \mathcal{P}'(-\theta) = \mathcal{G}(\theta) * \mathcal{P}''(-\theta) . \qquad (17.14)$$

The second form on the right side is more useful since it avoids the practical 
difficulty of differentiating a noisy occultation curve. In principle, this restoration 
provides $I_1$ without limit on the angular resolution, in contrast with the performance 
of an array. Remember, however, that the amplitude of the spatial frequency 
sensitivity of the occultation curve, which is given by Eq. (17.13), is proportional to $1/u$. 
Thus, in the restoration in Eq. (17.14), the amplitudes of the Fourier components, 
which also include the noise, are increased in proportion to $u$. The increase of the 
noise sets a limit to the useful resolution. This limit can be conveniently introduced 
by replacing $\mathcal{P}''(\theta)$ in Eq. (17.14) by $\mathcal{P}''(\theta)$ convolved with a Gaussian function of 
$\theta$ with a resolution $\Delta\theta$. One then derives $I_1$ as it would be observed with a beam 
of the same Gaussian shape. In practice, the introduction of the Gaussian function 
is essential to the method, since it ensures the convergence of the convolution 
integral in Eq. (17.14). The optimal choice of $\Delta\theta$ depends on the SNR. Examples 
of restoring functions for various resolutions can be found in von Hoerner (1964). 

The discussion above follows the classical approach to reduction of Moon- 
occultation observations, which developed from the geometrical optics analogy. One 
can envisage the reduction more succinctly as taking the Fourier transform of the 
occultation curve, dividing by $\overline{\mathcal{P}}(u)$ (with suitable weighting to control the increase 
of the noise), and retransforming to the $\theta$ domain. This process is mathematically 
equivalent to that in Eq. (17.14). 
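
The Fourier-domain view of the restoration can be sketched in a few lines. This is only an illustration under stated assumptions, not Scheuer's procedure as implemented in practice: angles are measured in units of the Fresnel angle (so $\theta_F = 1$), the transform of the differentiated edge pattern is taken in unit-amplitude form (its constant factor is dropped), the source is a hypothetical pair of Gaussian components, the data are noise-free, and a naive discrete Fourier transform is used to keep the example free of external libraries.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for a small demo)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for m in range(n):
        acc = sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
                  for k, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def spatial_freqs(n, step):
    """Spatial frequency u of each DFT bin, in cycles per unit of theta."""
    return [(m if m < n // 2 else m - n) / (n * step) for m in range(n)]


def edge_filter(u, theta_f=1.0):
    """Unit-amplitude transform of the differentiated edge pattern: pure phase."""
    sgn = (u > 0) - (u < 0)
    return cmath.exp(2j * math.pi * theta_f ** 2 * u * u * sgn)


def restore(g_prime, step, fwhm=0.0):
    """Divide out the edge phase; optionally taper with a Gaussian beam of given FWHM."""
    us = spatial_freqs(len(g_prime), step)
    tapered = []
    for s, u in zip(dft(g_prime), us):
        taper = math.exp(-(math.pi * fwhm * u) ** 2 / (4.0 * math.log(2.0)))
        tapered.append(s * taper / edge_filter(u))
    return [v.real for v in dft(tapered, inverse=True)]


# Hypothetical source; theta is in units of the Fresnel angle.
N, STEP = 128, 0.25
theta = [(i - N // 2) * STEP for i in range(N)]
source = [math.exp(-0.5 * ((t + 1.5) / 0.5) ** 2)
          + 0.6 * math.exp(-0.5 * ((t - 2.0) / 0.5) ** 2) for t in theta]

# Forward model: derivative of the occultation curve (source convolved with P').
us = spatial_freqs(N, STEP)
g_prime = [v.real for v in
           dft([s * edge_filter(u) for s, u in zip(dft(source), us)], inverse=True)]

restored = restore(g_prime, STEP)             # no taper: recovers the source
smoothed = restore(g_prime, STEP, fwhm=1.0)   # source as seen with a Gaussian beam
```

Because the filter has unit amplitude at all $u$, the untapered restoration recovers the source exactly in this noise-free case. With noisy data the Gaussian taper is what limits the amplification of high-frequency noise, at the cost of the resolution set by `fwhm`.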

An estimate of the noise-imposed limit on the angular resolution can be obtained 
using the geometrical optics model, since the SNRs of the Fourier components 
are the same as for the actual point-source response. Consider the region of an 
occultation curve (see Fig. 17.4b) in which the main change in the received power 
occurs, and let $\tau$ be a time interval in which the change in the record level is equal 
to the rms noise. Then if $v_m$ is the rate of angular motion of the Moon's limb over 
the radio source, the obtainable angular resolution is approximately 

$$\Delta\theta = v_m \tau . \qquad (17.15)$$

During the interval $\tau$, the flux density at the antenna changes by $\Delta S$. Let $\theta_s$ be the 
width of the main structure of the source in a direction normal to the Moon's limb, 
and let $S$ be the total flux density of the source. Then for a source of approximately 
similar dimension in any direction, the average intensity is approximately $S/\theta_s^2$. The 
change in solid angle of the covered part of the source in time $\tau$ is $\theta_s \Delta\theta$, and 

$$\frac{\Delta\theta}{\theta_s} \simeq \frac{\Delta S}{S} . \qquad (17.16)$$

The SNR at the receiver output for a component of flux density $\Delta S$ is 

$$R_{\rm sn} = \frac{A\, \Delta S \sqrt{\Delta\nu\, \tau}}{2kT_S} , \qquad (17.17)$$

where $A$ is the collecting area of the antenna, $\Delta\nu$ and $T_S$ are the bandwidth and 
system temperature of the receiving system, and $k$ is Boltzmann's constant. Note 
that the thermal emission from the Moon can contribute substantially to $T_S$. The 
conditions that we are considering correspond to $R_{\rm sn} \simeq 1$, and from Eqs. (17.15)–(17.17), 
we obtain 

$$\Delta\theta = \left(\frac{2kT_S\theta_s}{AS}\right)^{2/3} \left(\frac{v_m}{\Delta\nu}\right)^{1/3} . \qquad (17.18)$$

Note that frequency (or wavelength) does not enter directly into Eq. (17.18), but the 
values of several parameters, for example, $S$, $\Delta\nu$, and $T_S$, depend upon the observing 
frequency. As an example, consider an observation at a frequency in the 100–300-MHz 
range for which we use $A = 2000$ m$^2$, $T_S = 200$ K, and $\Delta\nu = 2$ MHz. 
For an example of a radio source, we take $S = 10^{-26}$ W m$^{-2}$ Hz$^{-1}$ (1 Jy) and 
$\theta_s = 5''$. $v_m$ is typically $0.3''$ s$^{-1}$. With these values, Eq. (17.18) gives $\Delta\theta = 0.7''$. 
Although Eq. (17.18) is derived using a geometrical optics approach, this does not 
limit its applicability. For an observed occultation curve, the equivalent curve for 
geometrical optics can be obtained by adjustment of the phases of the Fourier 
components. 
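
The worked example for Eq. (17.18) can be reproduced directly. The sketch below uses the parameter values quoted in the text, with angles kept in arcseconds throughout (the formula only requires $\theta_s$ and $v_m$ to share the same angular unit), and then checks that the resulting interval corresponds to an SNR of about unity in Eq. (17.17). The function name is hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K


def occultation_resolution(area_m2, t_sys_k, bandwidth_hz, flux_w_m2_hz,
                           theta_s, v_m):
    """Noise-limited resolution of Eq. (17.18); theta_s and v_m share one angular unit."""
    sensitivity_term = 2.0 * K_BOLTZMANN * t_sys_k * theta_s / (area_m2 * flux_w_m2_hz)
    return sensitivity_term ** (2.0 / 3.0) * (v_m / bandwidth_hz) ** (1.0 / 3.0)


# Values from the example in the text.
A, T_S, D_NU, S = 2000.0, 200.0, 2.0e6, 1.0e-26
THETA_S, V_M = 5.0, 0.3                      # arcsec and arcsec per second

delta_theta = occultation_resolution(A, T_S, D_NU, S, THETA_S, V_M)

# Consistency check against Eqs. (17.15)-(17.17): the SNR should be close to 1.
tau = delta_theta / V_M
delta_s = S * delta_theta / THETA_S
r_sn = A * delta_s * math.sqrt(D_NU * tau) / (2.0 * K_BOLTZMANN * T_S)
print(f"delta_theta = {delta_theta:.2f} arcsec, tau = {tau:.2f} s, R_sn = {r_sn:.3f}")
```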

The bandwidth of the receiving system has the effect of smearing out angular 
detail in an occultation observation. Thus, since the SNR increases with bandwidth, 
for any observation there exists a bandwidth with which the sensitivity to fine 
angular structure is maximized. This bandwidth is approximately $\nu^2 \Delta\theta^2 R_m/c$, 
which can be derived from the requirement that the phase term in Eq. (17.13) 
not change significantly over the bandwidth. This result can be compared to the 
bandwidth limitation for an array [given by Eq. (6.70)] by noting that a measurement 
by lunar occultation with resolution $\Delta\theta$ involves examination of the wavefront, at 
the distance of the Moon, on a linear scale of $\lambda/\Delta\theta$. Such an interval subtends an 
angle $\lambda/(\Delta\theta\, R_m)$ at the Earth. Further discussion of such details, and of the practical 
implementation of Scheuer’s restoration technique, is given by von Hoerner (1964), 
Cohen (1969), and Hazard (1976). Note that a source may undergo a number of 
occultations within a period of a few months, with the Moon’s limb traversing 
the source at different position angles. If a sufficient range of position angles is 
observed, the one-dimensional intensity distributions can be combined to obtain a 
two-dimensional image of the source [see, e.g., Taylor and De Jong (1968)]. In 
radio astronomy, the use of lunar occultations has become less important since the 
development of very-long-baseline interferometry. 
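
For a sense of scale, the approximate optimum bandwidth $\nu^2 \Delta\theta^2 R_m/c$ mentioned above can be evaluated for a particular case. The frequency of 300 MHz and the resolution of 1″ below are assumed values chosen only for illustration.

```python
import math

C_LIGHT = 2.99792458e8                  # m/s
R_MOON_M = 3.84e8                       # Earth-Moon distance, m
ARCSEC = math.pi / (180.0 * 3600.0)     # radians per arcsecond


def optimum_bandwidth_hz(freq_hz: float, delta_theta_rad: float) -> float:
    """Approximate bandwidth nu^2 * dtheta^2 * R_m / c for an occultation."""
    return freq_hz ** 2 * delta_theta_rad ** 2 * R_MOON_M / C_LIGHT


# Assumed example: 300 MHz and a resolution of 1 arcsec.
bandwidth = optimum_bandwidth_hz(300e6, 1.0 * ARCSEC)
print(f"optimum bandwidth ~ {bandwidth / 1e6:.2f} MHz")
```

The quadratic dependence on both $\nu$ and $\Delta\theta$ means that halving the desired resolution element reduces the usable bandwidth by a factor of four.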

The method of lunar occultation has been widely used in optical and infrared 
astronomy to measure the size and the limb darkening of stars, and the separation 
of close binary stars. Consistency of the results with those of optical interferometry 
proves that the lunar occultation method is not corrupted by variations in the lunar 
topography, which can be expected to become important when the size of the 
variations is comparable to the Fresnel scale. Angular sizes have been routinely 
measured down to about 1 mas. The analysis of stellar occultation curves is usually 
done by fitting parameterized models, rather than the reconstruction methods used 
in radio observations described above. A review of special considerations for 
lunar occultation observations at optical and infrared wavelengths can be found in 
Richichi (1994). Extensive measurements of stellar diameters [see, e.g., White and 
Feierman (1987)] and binary star separations [see, e.g., Evans et al. (1985)] have 
been made. Other applications include the measurement of subarcsecond dust shells 
surrounding Wolf-Rayet stars [see, e.g., Ragland and Richichi (1999)]. 


17.3 Measurements on Antennas 


Measurement of the electric field distribution over the aperture of an antenna is 
an important step in optimizing the aperture efficiency, especially in the case of 
a reflector antenna for which such results indicate the accuracy of the surface 
adjustment. The Fourier transform relationship between the voltage response pattern 
of an antenna and the field distribution in the aperture has been derived in 
Sect. 15.1.2. If $x$ and $y$ are axes in the aperture plane, the field distribution $\mathcal{E}(x_\lambda, y_\lambda)$ 
is the Fourier transform of the far-field voltage radiation (reception) pattern $V_A(l, m)$ 
(see Sect. 3.3.1), where $l$ and $m$ are here the direction cosines measured with respect 
to the $x$ and $y$ axes and the subscript $\lambda$ indicates measurement in wavelengths. Thus, 

$$V_A(l, m) \propto \int\!\!\int_{-\infty}^{\infty} \mathcal{E}(x_\lambda, y_\lambda)\, e^{j2\pi(x_\lambda l + y_\lambda m)}\, dx_\lambda\, dy_\lambda . \qquad (17.19)$$

Direct measurement of $\mathcal{E}$ can be made by moving a probe across the aperture plane, 
but care must be taken to avoid disturbing the field. Such a technique is useful 
for characterizing horn antennas for millimeter wavelengths (Chen et al. 1998). 
However, in many applications, especially for large antennas on fully steerable 
mounts, it is easier to measure $V_A$. It is necessary to measure both the amplitude 
and phase of $V_A(l, m)$ in order to perform the Fourier transform for $\mathcal{E}(x_\lambda, y_\lambda)$. To 
accomplish this, the beam of the antenna under test can be scanned over the direction 
of a distant transmitter, and a second, nonscanning, antenna can be used to receive 
a phase reference signal. The function $V_A(l, m)$ is obtained from the product of the 
signals from the two antennas. This technique resembles the use of a reference beam 
in optical holography, and antenna measurements of this type have been described 
as holographic (Napier and Bates 1973; Bennett et al. 1976). 

The holographic technique is readily implemented for measurements of antennas 
in interferometers and synthesis arrays. If the instrumental parameters (baselines, 
etc.) and the source position are accurately known, and the phase fluctuations 
introduced by the atmosphere are negligible, then for an unresolved source, 
calibrated visibility values will have a real part corresponding to the flux density of 
the source and an imaginary part equal to zero (except for the noise). If one antenna 
of a correlated pair is scanned over the source, while the other antenna continues 
to track the source, the corresponding visibility values will be proportional to the 
amplitude and phase of $V_A(l, m)$ for the scanning antenna. Measurement of synthesis 
array antennas as outlined above was first described by Scott and Ryle (1977), whose 
analysis, and that of D'Addario (1982), we largely follow below. 

It is convenient to visualize the data in the aperture plane $\mathcal{E}(x_\lambda, y_\lambda)$ and in the 
sky plane $V_A(l, m)$ as discrete measurements at grid points in two $N \times N$ arrays to be 
used in the discrete Fourier transformation. For simplicity, consider a square antenna 
aperture with dimensions $d_\lambda \times d_\lambda$. Since $\mathcal{E}(x_\lambda, y_\lambda)$ is zero outside a range $\pm d_\lambda/2$, the 
sampling theorem of Fourier transforms indicates that the response must be sampled 
at intervals in $(l, m)$ no greater than $1/d_\lambda$. [This interval is twice the sampling 
interval for the power beam because the power beam is the Fourier transform of the 
autocorrelation function of $\mathcal{E}(x_\lambda, y_\lambda)$.] If the $V_A(l, m)$ samples are spaced at $1/d_\lambda$, 
the aperture data just fill the $\mathcal{E}(x_\lambda, y_\lambda)$ array. The spacing of the measurements in the 
aperture is $d_\lambda/N$. Therefore, $N$ is usually chosen so that the sample interval provides 
several measurements on each surface panel. In the $(l, m)$ plane, the range of angles 
over which the scanning takes place is $N$ times the pointing interval, that is, $N/d_\lambda$. 
This scan range is approximately $N$ beamwidths. The procedure is to scan with the 
antenna under test in $N^2$ discrete pointing steps and thereby obtain the $V_A(l, m)$ data. 
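
The sampling scheme can be illustrated with a one-dimensional, noise-free sketch. It assumes a forward transform with the kernel $e^{j2\pi x_\lambda l}$, uniform illumination, and a hypothetical aperture of 1000 wavelengths divided into 16 cells, one of which is displaced by 0.2 mm at an assumed wavelength of 10 mm. The beam pattern is sampled at $N$ pointings spaced by $1/d_\lambda$, and the inverse transform recovers the phase, and hence the displacement, in each cell.

```python
import cmath
import math


def far_field(aperture, d_lambda):
    """Sample V_A(l) at N pointings spaced 1/d_lambda from the aperture field."""
    n = len(aperture)
    # Cell centers across the aperture, in wavelengths (spacing d_lambda / N).
    xs = [(i - n / 2 + 0.5) * d_lambda / n for i in range(n)]
    # Pointing offsets in direction cosine (spacing 1 / d_lambda).
    ls = [(m - n // 2) / d_lambda for m in range(n)]
    pattern = [sum(e * cmath.exp(2j * math.pi * x * l) for e, x in zip(aperture, xs))
               for l in ls]
    return xs, ls, pattern


def aperture_field(pattern, xs, ls):
    """Inverse transform: recover the complex field in each aperture cell."""
    n = len(pattern)
    return [sum(v * cmath.exp(-2j * math.pi * x * l) for v, l in zip(pattern, ls)) / n
            for x in xs]


WAVELENGTH_MM = 10.0    # assumed observing wavelength
D_LAMBDA = 1000.0       # assumed aperture width in wavelengths
N = 16
surface_mm = [0.0] * N
surface_mm[5] = 0.2     # one cell displaced by 0.2 mm

# A surface displacement eps changes the reflected phase by 4*pi*eps/lambda.
aperture = [cmath.exp(1j * 4.0 * math.pi * eps / WAVELENGTH_MM) for eps in surface_mm]

xs, ls, pattern = far_field(aperture, D_LAMBDA)
recovered = aperture_field(pattern, xs, ls)
recovered_mm = [cmath.phase(e) * WAVELENGTH_MM / (4.0 * math.pi) for e in recovered]
scan_range = N / D_LAMBDA   # total range of pointing offsets, in direction cosine
```

A real measurement is two-dimensional, includes noise and illumination taper, and uses a fast Fourier transform, but the relationship between the pointing interval, the scan range, and the aperture cell size is the same as in this sketch.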

As a measure of the strength of the signal, let $R_{\rm sn}$ be the SNR obtained in time
$\tau_a$ with the beams of both antennas pointed directly at the source. Now suppose
that the $(x_\lambda, y_\lambda)$ aperture plane is divided into square cells (as in Fig. 5.3) with
sides $d_\lambda/N$ centered on the measurement points. Consider the contribution to the
correlator output of the signal from one such aperture cell, of area $(d_\lambda/N)^2$, in the
antenna under test. The effective beamwidth of such an aperture cell is $N$ times the
antenna beamwidth, that is, approximately the total scan width required. Such an
area contributes a fraction $1/N^2$ to the signal at the correlator output, so relative
to the noise at the correlator output, the component resulting from one aperture
cell is $R_{\rm sn}/N^2$ in time $\tau_a$, or $R_{\rm sn}/N$ in time $N^2\tau_a$, which is the total measurement
time. The accuracy of the phase measurement for the signal component from
one aperture cell, $\delta\phi$, is the reciprocal of $\sqrt{2}$ times the corresponding SNR, that
is, $N/(\sqrt{2}R_{\rm sn})$. The factor $\sqrt{2}$ is introduced because only the component of the
system noise that is normal to the signal (visibility) vector introduces error in
the phase measurement; see Fig. 6.8. Now a displacement $\epsilon$ in the surface of the
aperture cell causes a change of phase $4\pi\epsilon/\lambda$ in the reflected signal. Thus, an
uncertainty $\delta\phi$ in the phase of this signal component results in an uncertainty in
$\epsilon$ of $\delta\epsilon = \lambda\,\delta\phi/(4\pi) = \lambda N/(4\sqrt{2}\pi R_{\rm sn})$. From the accuracy $\delta\epsilon$ desired for the
surface measurement, we determine that the signal strength should be such that the
SNR in time $\tau_a$, with both beams on source, is


$$
R_{\rm sn} = \frac{N\lambda}{4\sqrt{2}\,\pi\,\delta\epsilon} .
$$


Having determined $R_{\rm sn}$, we can use Eqs. (6.48) and (6.49) to obtain values of
antenna temperature or flux density (W m$^{-2}$ Hz$^{-1}$) for the signal. If the two
antennas used are not of the same size, then in Eqs. (6.48) and (6.49), $A$, $T_A$, and
$T_S$ are replaced by the geometric means of the corresponding quantities. Several
simplifying approximations have been made. The statement that one aperture cell
contributes $1/N^2$ of the antenna output implies the assumption that the field strength
is uniform over the aperture. If the aperture illumination is tapered, a higher value
of $R_{\rm sn}$ will be required to maintain the accuracy at the outer edges. Consideration of
a square antenna overestimates $R_{\rm sn}$ for a circular aperture of diameter $d_\lambda$ by $4/\pi$.
The situation can be significantly different when the signal used in the holography
measurement is a cw (continuous wave) tone, for example, from a satellite. The
received signal power $P$ can be large compared with the receiver noise $kT_R\Delta\nu$
(D'Addario 1982). In that case, the noise in the correlator output is dominated
by the cross products formed by the signal and the receiver noise voltages. The
resulting SNR in time $\tau$ is $\sqrt{P\,\Delta\nu\,\tau/(kT_R\,\Delta\nu)} = \sqrt{P\tau/(kT_R)}$, which is independent
of the receiver bandwidth.
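
The two SNR expressions above lend themselves to a quick numerical check. The following is a minimal sketch, not taken from the cited papers: the function names and the example values (a 32 x 32 map at 3-mm wavelength with a 10-μm accuracy target) are hypothetical and chosen only for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def required_snr(n_grid, wavelength_m, surface_accuracy_m):
    """On-source SNR in time tau_a: R_sn = N*lambda / (4*sqrt(2)*pi*delta_eps)."""
    return n_grid * wavelength_m / (
        4.0 * math.sqrt(2.0) * math.pi * surface_accuracy_m
    )


def cw_tone_snr(power_w, t_receiver_k, integration_s):
    """SNR for a strong cw tone: sqrt(P*tau / (k*T_R)); the bandwidth cancels."""
    return math.sqrt(power_w * integration_s / (K_BOLTZMANN * t_receiver_k))


# Hypothetical example values, for illustration only.
n_grid = 32
snr = required_snr(n_grid, wavelength_m=3e-3, surface_accuracy_m=10e-6)
print(f"required on-source SNR in tau_a: {snr:.0f}")
print(f"total measurement time: {n_grid**2} x tau_a")
```

Because the required SNR grows linearly with $N$ while the number of pointings grows as $N^2$, finer aperture sampling is paid for twice, in source strength and in observing time.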

An example of holographic measurements on an antenna of a submillimeter- 
wavelength synthesis array is shown in Fig. 17.5. Some practical points are listed 
below. 


• The source used in a holographic measurement is ideally strong enough to allow a
high SNR to be obtained. Usually, either a signal from a transmitter on a satellite
or a cosmic maser source is used. Morris et al. (1988) describe measurements on
the 30-m-diameter antenna at Pico de Veleta in which a measurement accuracy
(repeatability) of 25 μm was achieved using the 22.235-GHz water maser in
Orion. For holography with interferometer elements, sources that are partially
resolved can be used (Serabyn et al. 1991).

• If the test antenna is on an altazimuth mount, the beam will rotate relative to the
sky as the observation proceeds. In determining the pointing directions, the $(l, m)$
axes of the sky plane should remain aligned with the local horizontal and vertical
directions. If the antenna is on an equatorial mount, the $(l, m)$ axes should be the
directions of east and north on the sky plane [i.e., the usual $(l, m)$ definition].

• If the source is strongly linearly polarized and the antennas are on altazimuth
mounts, it may be necessary to compensate for rotation of the beam. This is 
possible if the antennas receive on two orthogonal polarizations. 

• When using two separate antennas, differences in the signal paths resulting from
tropospheric irregularities can cause phase errors. It may be necessary to make 
periodic recordings with both beams centered on the source to determine the 
magnitude of such effects. In the case of measurement on a single large antenna, 
a small antenna mounted on the feed support structure of the large one, and 
pointing in the same direction as the large antenna’s beam, is sometimes used to 
provide the on-source reference signal. Tropospheric effects on the phase should 
then cancel. 

• An antenna may be rotated (through a limited angular range) about any axis
through its phase center without varying the phase of the received signal. The
phase center of a parabolic reflector lies on the axis of the paraboloid and is
roughly near the midpoint between the vertex and the aperture plane.¹ In the
scanning, the maximum angle through which the antenna is turned from the
on-source direction is $N/(2d_\lambda)$. If the axis about which it is turned is a distance $r$
from the phase center, the phase path length to the antenna will be increased by
$r[1 - \cos(N/(2d_\lambda))]$. If this distance is a significant fraction of a wavelength, a
phase correction must be applied to the signal at the correlator output.

• For an antenna in a radome, structural members of which can cause scattering
of the incident radiation, corrections are necessary. Rogers et al. (1993) describe 
such corrections for measurements on the Haystack 37-m-diameter antenna. 

• In measurements on the antennas of a correlator array in which the number of
antennas $n_a$ is large, a possible procedure would be to use one antenna to track
the source and provide the reference signal and to scan all the others over the
source. However, a better procedure would be to use $n_a/2$ antennas to track the
source while the other $n_a/2$ antennas are scanned. The averaging time would be
half that of the first procedure to allow the roles of the two sets of antennas
to be interchanged at the midpoint of the observation. However, there would
be $n_a/2$ different measurements for each antenna, so compared with the first
procedure, the sensitivity would be increased by a factor $\sqrt{n_a/4}$. Also,
cross-correlation of the signals from the tracking antennas would provide information
about the phase stability of the atmosphere, which would be useful in interpreting
the measurements.
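
Two of the practical points above reduce to one-line formulas: the path-length increase when the rotation axis is offset from the phase center, and the sensitivity gain from splitting the array into tracking and scanning halves. The sketch below evaluates both; the function names and the example numbers (an aperture 4000 wavelengths across, a 32-point scan, a 2-m axis offset) are hypothetical.

```python
import math


def path_increase_m(axis_offset_m, n_grid, d_wavelengths):
    """r * [1 - cos(N / (2 d_lambda))] at the maximum scan angle."""
    theta_max = n_grid / (2.0 * d_wavelengths)  # radians
    return axis_offset_m * (1.0 - math.cos(theta_max))


def half_array_sensitivity_gain(n_antennas):
    """Gain sqrt(n_a / 4) relative to using a single reference antenna."""
    return math.sqrt(n_antennas / 4.0)


# Hypothetical example: aperture 4000 wavelengths across, N = 32, r = 2 m.
extra_path = path_increase_m(2.0, 32, 4000.0)
print(f"path increase at scan edge: {extra_path * 1e6:.1f} micrometers")
print(f"sensitivity gain for 16 antennas: {half_array_sensitivity_gain(16):.1f}")
```

Whether the computed path increase matters depends on the observing wavelength: the correction is needed once it becomes a significant fraction of a wavelength.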


¹ Consider transmission from an antenna in which the parabolic surface is formed by rotation of the
parabola $x = ay^2$ around the $x$ axis. Radiation from a ring-shaped element of the surface between
the planes $x = x'$ and $x = x' + dx$ has an effective phase center on the $x$ axis at $x'$. The area of
such an element projected onto the aperture plane (i.e., normal to the $x$ axis) is independent of $x'$. If
the aperture illumination is uniform, each surface element between planes normal to the $x$ axis and
separated by the same increment makes an equal contribution to the electric vector in the far field.
Thus, the effective phase center of the total radiation should be on the $x$ axis, midway between
the vertex and the aperture plane. Note that this is an approximate analysis based on geometrical
optics.

A method that requires only measurement of the amplitude of the far-field pattern
has been developed by Morris (1985). In such a procedure the reference antenna
is not required. The method is based on the Misell algorithm (Misell 1973), and
the procedure can be outlined as follows. Input requirements are an initial "first
guess" model of the amplitude and phase of the field distribution across the antenna
aperture, and two measurements of the far-field amplitude pattern, one with the
antenna correctly focused and the other with the antenna defocused sufficiently
to produce phase errors of a few radians at the antenna edge. The model aperture
distribution is used to calculate the in-focus far-field pattern in amplitude and phase,
and the calculated in-focus amplitude is replaced by the measured amplitude. The
measured in-focus amplitude and the calculated phase are then used to calculate
the corresponding aperture amplitude and phase, which then become the new
aperture model. This new model is then used to calculate the defocused far-field 
pattern. In calculating the defocused pattern, it is assumed that in the aperture, the 
defocusing affects only the phase and that it introduces a component that varies in 
the aperture as the radius squared. The calculated defocused amplitude pattern is 
then replaced by the measured defocused pattern, and the corresponding in-focus 
aperture distribution is calculated and becomes the new model. In the continuing 
iterations, the in-focus and defocused amplitudes are calculated alternately. After
each calculation, the amplitude pattern is replaced by the corresponding measured 
pattern, and the result is used to upgrade the model. The required solution to which 
the procedure should converge is a model that fits both the in-focus and defocused 
responses. This technique requires a higher SNR than when phase measurements are 
made. For measurements near nulls in the beam, the required SNR is approximately 
equal to the square of that when the phase is measured (Morris 1985). 
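
The alternating substitution outlined above can be sketched in a few lines of NumPy. This is an illustrative toy, not the implementation of Morris (1985): the aperture is a synthetic, noise-free test case, the far field is modeled by a plain FFT, no aperture-support constraint is imposed, and all names and numerical values are assumptions.

```python
import numpy as np


def misell_iterate(meas_focus, meas_defocus, defocus_phase, model, n_iter=30):
    """Alternate the in-focus and defocused amplitude substitutions.

    Returns the final aperture model and, for each iteration, the relative
    misfit between the calculated and measured defocused amplitudes.
    """
    defocus = np.exp(1j * defocus_phase)  # phase-only defocus term
    misfit = []
    for _ in range(n_iter):
        # In-focus pattern from the model: keep phase, substitute amplitude.
        far = np.fft.fft2(model)
        model = np.fft.ifft2(meas_focus * np.exp(1j * np.angle(far)))
        # Defocused pattern from the updated model: same substitution.
        far_d = np.fft.fft2(model * defocus)
        misfit.append(np.linalg.norm(np.abs(far_d) - meas_defocus)
                      / np.linalg.norm(meas_defocus))
        model = np.fft.ifft2(meas_defocus * np.exp(1j * np.angle(far_d)))
        model = model * np.conj(defocus)  # back to the in-focus aperture
    return model, misfit


# Synthetic test case (all values hypothetical).
n = 64
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
r2 = (x**2 + y**2) / (n / 8.0) ** 2  # equals 1.0 at the aperture edge
inside = r2 <= 1.0
bump = 0.6 * np.exp(-((x - 3) ** 2 + (y + 2) ** 2) / 8.0)  # phase error, rad
true_aperture = inside * (1.0 - 0.5 * r2) * np.exp(1j * bump)
defocus_phase = 3.0 * r2  # 3 rad at the edge, varying as radius squared
meas_focus = np.abs(np.fft.fft2(true_aperture))
meas_defocus = np.abs(np.fft.fft2(true_aperture * np.exp(1j * defocus_phase)))

first_guess = inside.astype(complex)  # uniform amplitude, zero phase
model, misfit = misell_iterate(meas_focus, meas_defocus, defocus_phase,
                               first_guess)
print(f"defocused-amplitude misfit: {misfit[0]:.3f} -> {misfit[-1]:.3f}")
```

Each substitution replaces the calculated pattern by the nearest pattern having the measured amplitude, so in this noise-free sketch the recorded misfit should not increase from one iteration to the next. That property alone does not guarantee that the iteration reaches the true aperture distribution.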

A holographic method involving only one antenna, suitable for a large
submillimeter-wavelength telescope, is described by Serabyn et al. (1991). Measurements
are made in the focal plane using a shearing interferometer, an adaptation of a 
technique used for optical instruments. 


17.4 Detection and Tracking of Space Debris 


Tracking of satellites and space debris by receiving broadcast signals scattered
from them is known as passive radar; the broadcast stations are called
"noncooperative transmitters." The technique generally requires a large separation
between transmitter and receiver to avoid RFI from direct reception of the
transmission. The scattering cross section for a sphere of radius $a$ is approximately


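$$
\begin{aligned}
\sigma &= \pi a^2 , & \lambda &\ll 2\pi a , \\
\sigma &\simeq \beta \pi a^2 \left(\frac{a}{\lambda}\right)^4 , & \lambda &\gg 2\pi a ,
\end{aligned}
\tag{17.21}
$$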


where $\beta \simeq 10^4$. The short wavelength limit is called geometrical scattering, and
the long wavelength limit is called Rayleigh scattering. These two limits are part of
the general theory of Mie scattering [see, e.g., Jackson (1998)]. The cross section of
dielectric spheres scales with $a$ and $\lambda$ in the same way. Equation (17.21) shows that
$\sigma/(\pi a^2)$ decreases as $(a/\lambda)^4$, so there is a sharp decrease in sensitivity for scatterers
smaller than ~$\lambda$. The tracking of satellites and space debris is an important part of
space situational awareness.
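
The two limits of Eq. (17.21) can be evaluated as follows. This is a rough sketch: the function names are hypothetical, and taking the smaller of the two limits as a bridge across the intermediate (resonance) regime is an assumption made here for convenience, not part of the equation.

```python
import math

BETA = 1.0e4  # approximate coefficient of the Rayleigh limit in Eq. (17.21)


def geometric_cross_section(radius_m):
    """Short-wavelength limit: sigma = pi a^2."""
    return math.pi * radius_m**2


def rayleigh_cross_section(radius_m, wavelength_m, beta=BETA):
    """Long-wavelength limit: sigma ~ beta * pi a^2 * (a / lambda)^4."""
    return beta * math.pi * radius_m**2 * (radius_m / wavelength_m) ** 4


def approx_cross_section(radius_m, wavelength_m):
    """Crude bridge between the limits (assumption): take the smaller value."""
    return min(geometric_cross_section(radius_m),
               rayleigh_cross_section(radius_m, wavelength_m))


wavelength = 3.0  # m, roughly the 100-MHz FM band
for radius in (0.05, 0.1, 0.5):
    sigma = approx_cross_section(radius, wavelength)
    print(f"a = {radius:4.2f} m: sigma ~ {sigma:.2e} m^2, "
          f"sigma/(pi a^2) = {sigma / geometric_cross_section(radius):.2e}")
```

Halving the radius in the Rayleigh regime lowers $\sigma/(\pi a^2)$ by a factor of 16 and $\sigma$ itself by a factor of 64, which is the sharp loss of sensitivity for small scatterers noted above.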

The use of radio arrays to passively track space objects has been demonstrated 
by Tingay et al. (2013) with the Murchison Widefield Array (MWA). The MWA, 
which operates in the 80-300 MHz frequency range, is located in Western Australia, 
a region of low population density and radio-quiet environment. The antennas are 
dipoles mounted at ground level, which is helpful in shielding from direct reception
of broadcast signals in the 87.5–108 MHz FM band. The FM signals originate from
the area several hundred kilometers distant from Perth in southwestern Australia 
and, after scattering from objects in space, have been detected by the MWA. The 
directions of the incoming signals can be measured by the array, and as a test of the 
ability to track individual objects in space, reflections from the International Space 
Station were detected. 

For this exercise, the astronomical interferometric delay model was adapted in an 
ad hoc fashion. Calculations based on Eq. (17.21) indicate that for a radius greater 
than 0.5 m, an object could plausibly be expected to be detected up to altitudes 
of approximately 1,000 km. The large collecting area of the MWA is helpful for 
such observations. At the FM-band frequencies, the field of view of the MWA is 
~ 2,400 sq. deg., and the beamwidth is ~ 6 arcmin. It is estimated that on average, 
~ 50 meter-size pieces of debris will be present within the MWA field of view at 
any time. Most will be at distances between the near-field and far-field distances for 
meter-wavelength observation. 

A related application of radio interferometry is the near-field three-dimensional 
positioning of active satellites with VLBI arrays, which is described in Sect. 9.11. 


17.5 Earth Remote Sensing by Interferometry 


Global radio measurements have been made of the Earth since the beginning of 
the satellite era. For these measurements, the basic principle is that, in the absence 
of radiative transfer effects, the brightness temperature $T_B$ is related to the physical
temperature $T$ of the surface through the emissivity $e$,


$$
T_B = eT . \tag{17.22}
$$


Since the emissivity of a material is related to its dielectric constant, the properties
of the Earth's surface (e.g., moisture content of soil, salinity of sea water, and the
structure of ice in the polar regions) can be deduced from maps of $T_B$. To obtain
sufficient resolution at radio wavelengths, relatively large apertures are needed. In 
2009, the European Space Agency launched the Soil Moisture and Ocean Salinity 
(SMOS) mission (McMullan et al. 2008; Kerr et al. 2010). This instrument strongly 
resembles a miniature version of the VLA. It has 69 antennas in a Y configuration 
(see Fig. 17.6). The system operates in the protected band of 1400–1427 MHz,
which turns out to be an excellent frequency range to determine soil parameters 
and ocean salinity. The arm lengths are about 4 m, and the satellite is in a circular 
orbit with a height of 758 km and a period of 1.7 h. The maximum resolution is 
~ 2.6°, corresponding to a linear resolution of about 35 km. The instantaneous field 
of view is about 1100 km. The (u, v) plane is sampled every 1.2 s. Most points on the 
Earth are revisited every three days. The theory of image formation is a modified
version of that presented in this book (Anterrieu 2004; Corbella et al. 2004). 

Fig. 17.6 Artist's conception of the SMOS satellite, a downward-looking interferometric array
operating at 21-cm wavelength and imaging the Earth at a resolution of 35 km. The length of each 
of the three arms of the array is 4 m, and the array is tilted by 32° to the tangent plane of the Earth 
below it. Image courtesy of and © European Space Agency. 
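
The orbital period and linear resolution quoted for SMOS follow from simple geometry. The sketch below reproduces them under stated assumptions: a circular orbit, a spherical Earth, and standard values of Earth's radius and gravitational parameter, none of which come from the text. The function names are hypothetical.

```python
import math

GM_EARTH = 3.986004418e14  # m^3 s^-2, standard gravitational parameter
R_EARTH = 6.371e6          # m, mean radius (spherical-Earth assumption)


def orbital_period_h(height_m):
    """Period of a circular orbit at the given height, from Kepler's third law."""
    semi_major_axis = R_EARTH + height_m
    return 2.0 * math.pi * math.sqrt(semi_major_axis**3 / GM_EARTH) / 3600.0


def linear_resolution_km(angular_resolution_deg, height_km):
    """Footprint of one resolution element directly below the satellite."""
    return math.radians(angular_resolution_deg) * height_km


print(f"period at 758 km: {orbital_period_h(758e3):.2f} h")
print(f"2.6 deg from 758 km: {linear_resolution_km(2.6, 758.0):.1f} km")
```

The small-angle footprint applies at nadir only; away from nadir the slant range and the tilt of the array enlarge the resolution element.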


The recovery of soil properties from the brightness temperature is a complex 
endeavor. The first step in the recovery process is based on a dielectric mixing model 
(Dobson et al. 1985) and the Fresnel reflection laws giving the relation between the 
emissivity and the dielectric constant of a surface material. A plane wave in free 
space incident on a flat surface with a dielectric constant $\epsilon$ at an incidence angle $\alpha$
will have power reflection coefficients of


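$$
\begin{aligned}
r_\parallel &= \left| \frac{\epsilon\cos\alpha - \sqrt{\epsilon - \sin^2\alpha}}{\epsilon\cos\alpha + \sqrt{\epsilon - \sin^2\alpha}} \right|^2 , \\
r_\perp &= \left| \frac{\cos\alpha - \sqrt{\epsilon - \sin^2\alpha}}{\cos\alpha + \sqrt{\epsilon - \sin^2\alpha}} \right|^2 ,
\end{aligned}
\tag{17.23}
$$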


where $r_\parallel$ is for the electric vector component in the plane of propagation and $r_\perp$ is
for the component normal to the plane. These are the Fresnel reflection coefficients
(note that the index of refraction is $\sqrt{\epsilon}$). The emissivity, as a function of incidence
angle, is


$$
e(\alpha) = 1 - r(\alpha) . \tag{17.24}
$$

The emissivity goes to unity for $r_\parallel$ at the Brewster angle $\alpha_B$,² given by

$$
\tan\alpha_B = \sqrt{\epsilon} . \tag{17.25}
$$


In the case of normal incidence ($\alpha = 0$), the emissivity is


$$
e(0) = 1 - \left| \frac{\sqrt{\epsilon} - 1}{\sqrt{\epsilon} + 1} \right|^2 . \tag{17.26}
$$


The values of $\epsilon$ for various types of soils and water saturation vary from about 2 to
50, corresponding to a range of normal-incidence emissivity $e(0)$ from 0.5 to 0.97 and
a brightness temperature range, for a nominal surface temperature of 280 K, of 140 to 270 K.³
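
The Fresnel relations above can be evaluated directly. The following sketch computes the two power reflection coefficients, the Brewster angle, and the normal-incidence emissivity and brightness temperature. The function names are hypothetical, and the use of `cmath` (so that a complex, lossy dielectric constant could be passed) is a convenience assumed here rather than something the text requires.

```python
import cmath
import math


def fresnel_power_reflection(eps, alpha_rad):
    """Return (r_parallel, r_perpendicular) for dielectric constant eps."""
    root = cmath.sqrt(eps - math.sin(alpha_rad) ** 2)
    cos_a = math.cos(alpha_rad)
    r_par = abs((eps * cos_a - root) / (eps * cos_a + root)) ** 2
    r_perp = abs((cos_a - root) / (cos_a + root)) ** 2
    return r_par, r_perp


def brewster_angle_rad(eps):
    """tan(alpha_B) = sqrt(eps), for real eps."""
    return math.atan(math.sqrt(eps))


def normal_emissivity(eps):
    """e(0) = 1 - r(0); both polarizations coincide at normal incidence."""
    return 1.0 - fresnel_power_reflection(eps, 0.0)[0]


surface_temperature_k = 280.0  # nominal value used in the text
for eps in (2.0, 10.0, 50.0):
    e0 = normal_emissivity(eps)
    print(f"eps = {eps:4.1f}: e(0) = {e0:.3f}, "
          f"T_B = {e0 * surface_temperature_k:.0f} K, "
          f"Brewster angle = {math.degrees(brewster_angle_rad(eps)):.1f} deg")
```

At the Brewster angle the parallel reflection coefficient vanishes, so the emissivity in that polarization reaches unity, which is the property exploited in the Venus measurement mentioned in the footnote.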

The actual retrieval of soil moisture requires careful modeling of the surface 
temperature, subsurface temperature gradient, surface roughness, and radiative 
transfer through the vegetation layer using a physically based algorithm (Kerr et al.
2012) or statistical methods such as neural networks (Rodriguez-Fernandez et al. 
2015). An example of a soil moisture map is shown in Fig. 17.7. The nominal 
accuracy for the technique is 4% in volumetric moisture content. The dielectric 
constant of sea water is ~ 80, so the ocean brightness temperature is usually below 
100 K. Retrieval of ocean salinity is challenging, and even the reflection of the
galactic emission on the ocean surface needs to be taken into account for accurate 
determinations (Font et al. 2012). 


17.6 Optical Interferometry 


The principles of optical interferometry are essentially identical to those at radio 
frequencies, but accurate measurements are more difficult to make at optical 
wavelengths. One difficulty arises because irregularities in the atmosphere introduce 
variations in the effective path length that are large compared with the wavelength 
and thus cause the phase to vary irregularly by many rotations. Also, obtaining the 
mechanical stability of an instrument required to obtain fringes at a wavelength 
of order 0.5 μm presents a formidable problem. However, the practicality of
synthesis imaging in the optical spectrum has been demonstrated using phase 
closure techniques, see, e.g., Haniff et al. (1987) and Baldwin et al. (1996). In 
the absence of visibility phase, the amplitude data can be interpreted in terms of 
the autocorrelation of the intensity distribution, as explained in Sect. 11.3.3, or
in terms of models of the intensity distribution. Techniques for two-dimensional
reconstruction without phase data [see, e.g., Bates (1984)] are also applicable.
Optical interferometry is an active and growing field, and here we attempt only
to give an overview of some basic principles. See Further Reading at the end of this
chapter for a collection of important publications in optical interferometry.


² Clark and Kuz'min (1965) used the Owens Valley interferometer to make the first passive
measurement of the dielectric constant of the surface of Venus ($\epsilon = 2.2 \pm 0.2$) by effectively
measuring the Brewster angle.

³ The dielectric constant of sea water is ~80, so the ocean brightness temperature is usually below
100 K.


Fig. 17.7 Observations of the Earth made with the European Space Agency's SMOS orbiting
synthesis array from data taken on 1 July 2015, overlaid on a visual image. (left) A "swath" map
constructed from many snapshot images of brightness temperature in one polarization at a fixed
incidence angle of 42.5°. The range of the color bar is 180–290 K. (right) Reconstructed map of
the soil moisture based on complex retrieval algorithms and observations of each location with
multiple angles of incidence and two polarizations. The range of the color bar is 0–0.5 m³/m³
(fractional volume). The brown shading indicates areas where the soil moisture value was not
accurately retrieved. Images courtesy of Nemesio Rodriguez-Fernandez and Arnaud Mialon.

Before discussing instruments, we briefly review some relevant atmospheric
parameters. The irregularities in the atmosphere give rise to random variations in the
refractive index over a large range of linear scales. For any particular wavelength,
there exists a scale size over which portions of a wavefront remain substantially
plane compared with the wavelength, that is, atmospheric phase variations are small
compared with $2\pi$. This scale size is represented by a parameter, the Fried length $d_f$
(Fried 1966); see the discussion following Eq. (13.102). The Fried length is equal to
$3.2d_0$, where $d_0$ is the spacing between paths through the atmosphere for which
the rms phase difference is one radian; see Eq. (13.102). Regions for which the
uniformity of the phase path lies within this range are sometimes referred to as
seeing cells. The scale size $d_f$ and the height at which the dominant irregularities
occur define an isoplanatic angle (or isoplanatic patch) size, that is, an angular range
of the sky within which the incoming wavefronts from different points encounter
similar phase shifts. Within an isoplanatic patch, the point-spread function remains
constant, so the convolution relationship between source and image holds. Typical
figures for the 50th percentile value of $d_f$, which scales as $\lambda^{6/5}$ [see Eq. (13.102)],
and the isoplanatic angle are given in Table 17.1. Also included for comparison are
the corresponding values of the diffraction-limited resolution of a telescope of
1-m-diameter aperture. Optical interferometers provide a powerful means of studying the
structure functions of the atmosphere at infrared and optical wavelengths; see, for
example, Bester et al. (1992) and Davis et al. (1995). Note that techniques involving
correction of atmospheric distortion of the wavefront by means of the telescope
hardware are referred to as adaptive optics [see, e.g., Roggemann et al. (1997) and
Milonni (1999)]. Most large telescopes have adaptive optics systems. Such systems
are strongly analogous to the techniques of self-calibration and phase referencing in
radio astronomy.


Table 17.1 Atmospheric and instrumental parameters at visible and infrared wavelengths

Wavelength (μm)       $d_f$ (m)   Isoplanatic angle   Resolution of 1-m-    Atmospheric
                                  at zenith           diameter aperture     resolution ($\lambda/d_f$)
0.5 (visible)         0.14        5.5″                0.13″                 0.70″
2.2 (near infrared)   0.83        33″                 0.55″                 0.55″
20 (far infrared)     11.7        8′                  5.0″                  0.35″

Updated from Woolf (1982).
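
The wavelength scaling of the Fried length can be used to regenerate several columns of Table 17.1 from the single visible-band entry. The sketch below does so; the function names are hypothetical, and the factor 1.22 in the diffraction limit is an assumption (the Rayleigh criterion for a circular aperture) that appears to match the tabulated values.

```python
import math

ARCSEC_PER_RAD = math.degrees(1.0) * 3600.0


def fried_length_m(wavelength_um, d_ref_m=0.14, ref_wavelength_um=0.5):
    """Scale the Fried length as wavelength^(6/5) from a reference value."""
    return d_ref_m * (wavelength_um / ref_wavelength_um) ** 1.2


def atmospheric_resolution_arcsec(wavelength_um, d_f_m):
    """Seeing-limited resolution lambda / d_f."""
    return wavelength_um * 1e-6 / d_f_m * ARCSEC_PER_RAD


def diffraction_limit_arcsec(wavelength_um, diameter_m=1.0):
    """1.22 lambda / D (assumed Rayleigh criterion)."""
    return 1.22 * wavelength_um * 1e-6 / diameter_m * ARCSEC_PER_RAD


for wavelength_um in (0.5, 2.2, 20.0):
    d_f = fried_length_m(wavelength_um)
    print(f"{wavelength_um:5.1f} um: d_f = {d_f:5.2f} m, "
          f"lambda/d_f = {atmospheric_resolution_arcsec(wavelength_um, d_f):.2f} arcsec, "
          f"1-m aperture = {diffraction_limit_arcsec(wavelength_um):.2f} arcsec")
```

Because $d_f$ grows faster than $\lambda$, the seeing-limited resolution $\lambda/d_f$ improves slowly toward longer wavelengths (as $\lambda^{-1/5}$), while the diffraction limit of a fixed aperture degrades in proportion to $\lambda$.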


17.6.1 Instruments and Their Usage 


The use of interferometry for measurement of the angular sizes of stars was 
suggested by Fizeau (1868), and the earliest attempted measurements of this type 
are those of Stéphan (1874), using a mask with two apertures on the objective lens 
of a telescope. Unfortunately, Stéphan’s telescope was not large enough to resolve 
any of the stars he observed. The first successful measurement of the diameter of 
a star was made by Michelson and Pease (1921) on the supergiant star Betelgeuse, 
as described in Sect. 1.3.2. For this measurement, four plane mirrors were mounted 
on a beam attached to the telescope, so that signals received with a baseline spacing 
of 6 m were reflected into the telescope objective to form fringes. In this type of 
measurement, the whole instrument was carried on the mounting of the telescope, 
which simplified the pointing. However, attempts to use a similar system with an 
increased spacing between the mirrors were generally unsuccessful, because of the 
extreme stability required to maintain the relative positions of the mirrors to an 
accuracy of a few tenths of the optical wavelength. Thus, little progress was made 
in optical interferometry until the development of modern electronics and computer 
control for positioning of instrumental components. This allows longer baselines to 
be used but has become possible only in recent decades. 

Fig. 17.8 Basic features of an optical interferometer. The broken line represents the light path
from a star. From Davis and Tango (1985). With permission from and © W. J. Tango.


Figure 17.8 illustrates some of the basic features of a modern optical interferometer.
The two mirrors S are mounted as siderostats and track the optical source under
study. The positions of the retroreflectors R are continuously adjusted to equalize
the lengths of the paths from the source to the combination point B. This delay 
compensation is usually implemented in evacuated tubes because the geometric 
delay of the interferometer largely occurs above the atmosphere. If air delay lines 
were used, a separate mechanism would be needed to compensate for the dispersive 
component of the delay, which is difficult to implement in wide bandwidth systems 
[see, e.g., Benson et al. (1997)]. The siderostats are mounted on stable foundations, 
and the rest of the system is usually mounted on optical benches within a controlled 
environment. The apertures of the interferometer, determined by the mirrors S, are 
made no larger than the Fried length $d_f$. Thus, the wavefront across the mirror
remains essentially plane, and the effect of the irregularities is to produce a variation 
in the angle of arrival of the wavefront. The variation cannot be tolerated since 
the angles of the beams at the combination point B must be correct to within 1”. 
To mitigate this effect, the polarizing beamsplitter cubes P reflect light to quadrant 
detectors Q, which produce a voltage proportional to any displacement of the angle 
of the light beam. These voltages are then used to control the tilt angles of the 
mirrors T, to compensate for the wavefront variation. A servo loop with bandwidth 
~ 1 kHz is required to follow the fastest atmospheric effects. The filters F define the 
operating wavelength. The two detectors $D_1$ and $D_2$ respond to points on the fringe
pattern spaced by one-quarter of a fringe cycle, and their outputs provide a measure 
of the instantaneous amplitude and phase of the fringes. This method is described, 
for example, by Rogstad (1968), who has also pointed out that with a multielement 
system, the phase information can be utilized by means of closure relationships, as 
introduced in Sect. 10.3. The system in Fig. 17.8 is shown to illustrate some of the 
important features used in modern optical interferometers. In practice, the siderostat 
mirrors may be replaced by large-aperture telescopes, and the paths of the light to 
the point at which the fringes are formed may be considerably more complicated. 

Optical interferometers can be built with very wide bandwidths, that is, $\Delta\lambda/\lambda \sim$
0.1 or possibly more, so the central, or white light, fringe is readily identifiable. If
such a system is made to operate at two such wide wavelength bands simultaneously,
the effects of the atmosphere, which is slightly dispersive, can be removed.
Ground-based optical astrometry with dual-wavelength phase-tracking interferometers can
yield accurate positions of stars (Colavita et al. 1987, 1999). As examples of earlier
interferometry, Currie et al. (1974) made measurements using two apertures on a
single large telescope, and Labeyrie (1975) obtained the first successful measure- 
ments using two telescopes. For descriptions of later, more complex instruments, 
see, for example, Davis and Tango (1985); Shao et al. (1988); Baldwin et al. (1994); 
Mourard et al. (1994); Armstrong et al. (1998, 2013); Davis et al. (1999a,b); ten 
Brummelaar et al. (2005); and Jankov (2010). 

For use in space where the Earth’s atmosphere is avoided, optical interferometry 
holds great promise. The Space Interferometry Mission (SIM) (Shao 1998; Allen 
and Böker 1998; Böker and Allen 1999) was a space-based interferometer for 
the wavelength band 0.4–1.0 μm with variable baseline up to 10 m, intended to
provide synthesis imaging with a resolution of 10 mas, and to measure fringe
phases with sufficient accuracy to provide positions of stars to within 4 μas. It was
never launched. An application of space interferometry to the detection of planets
around distant stars is discussed by Bracewell and MacPhie (1979). The ratio of the
signal from the planet to that from the star is maximized by choosing an infrared
wavelength on the long-wavelength side of 20 μm and by placing a fringe-pattern
null in the direction of the star. A demonstration of the nulling technique using 
ground-based telescopes is described by Hinz et al. (1998). 

Rogstad (1968) describes a technique for measurement of the visibility phase 
using an interferometer in the presence of an atmospheric component of seeing 
(refraction). Consider a linear arrangement of mirrors (i.e., the optical receivers) in 
which a unit spacing occurs twice and all integral multiples of the unit spacing, up to 
a maximum value, occur at least once. The receivers are designed to measure both 
the amplitude and the phase of the visibility function. The phase of the visibility 
for each of these spacings can be derived from the measured phases plus a unit- 
spacing phase component. This unit-spacing phase contains a component due to 
the atmosphere, although the longer spacing values are free from the atmospheric 
effect. However, the unit-spacing phase affects only the position of the resulting 
image, i.e., the coordinates on the sky, and it can be set to zero without affecting the 
form of the image. This method has been implemented on several interferometers 
[e.g., Jorgensen et al. (2012)], where it is referred to as baseline bootstrapping. 

Several optical interferometers have been constructed from large telescopes that 
are used independently part of the time. For example, the Keck Observatory on 
Mauna Kea has two 10-m-diameter telescopes with a spacing of 85 m. As an 
interferometer, these telescopes can provide an effective angular resolution of 5 mas
at 2.2-μm wavelength and 24 mas at 10-μm wavelength. The European Southern
Observatory in Chile has constructed the Very Large Telescope Interferometer
(VLTI), which consists of four 8.2-m-diameter telescopes and four auxiliary
1.8-m-diameter telescopes. With current instrumentation [Petrov et al. (2007) and
Le Bouquin et al. (2011)], up to six baselines can be correlated at once, providing
multiple phase closures and imaging capability. Operating in the bands between 1.5
and 2.4 μm, resolution as fine as 2 mas can be achieved on baselines up to 130 m.
Spectral line capability is also provided with $\lambda/\Delta\lambda = 12{,}000$ (velocity resolution
of 25 km s$^{-1}$).
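
The resolutions quoted above follow from the fringe spacing $\lambda/D$ for baseline $D$, and the velocity resolution from $c\,\Delta\lambda/\lambda$. A minimal check, with hypothetical function names:

```python
import math

MAS_PER_RAD = math.degrees(1.0) * 3600.0 * 1e3
C_KM_S = 299_792.458


def fringe_resolution_mas(wavelength_m, baseline_m):
    """Angular resolution taken as lambda / baseline, in milliarcseconds."""
    return wavelength_m / baseline_m * MAS_PER_RAD


def velocity_resolution_km_s(resolving_power):
    """Velocity width of one spectral channel, c / (lambda / delta-lambda)."""
    return C_KM_S / resolving_power


print(f"Keck, 2.2 um, 85 m:  {fringe_resolution_mas(2.2e-6, 85.0):.1f} mas")
print(f"Keck, 10 um, 85 m:   {fringe_resolution_mas(10e-6, 85.0):.1f} mas")
print(f"VLTI, 1.5 um, 130 m: {fringe_resolution_mas(1.5e-6, 130.0):.1f} mas")
print(f"R = 12000:           {velocity_resolution_km_s(12000):.1f} km/s")
```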

In the systems mentioned above, the fringes are formed by combining the 
incoming radiation at the same wavelength as it is received, as in the classical 
Michelson stellar interferometer. They are therefore also referred to as direct 
detection systems. A disadvantage of arrays built in this way is that the light
cannot be divided without loss in SNR. An alternative to the direct detection system
is the heterodyne system, in which the light from each aperture is mixed with 
coherent light from a central laser to produce an intermediate frequency (IF). 
The IF waveforms are then amplified and correlated in an electronic system, in a 
manner basically identical to that used in radio interferometry. In comparison with 
a direct detection system, the sensitivity is greatly limited by the quantum effects 
mentioned in Sect. 1.4. It is also limited by the bandwidth that can be handled by 
the electronic amplifiers, unless the mixer outputs are split into many frequency 
channels, each of which is processed in parallel. A large bandwidth can then be 
processed using a correspondingly large number of amplifiers and correlators. The 
bandwidth division also has the effect of increasing the path length difference over 
which the signals remain coherent. The heterodyne technique has been used in 
infrared interferometry; see, for example, Johnson et al. (1974), Assus et al. (1979), 
and Bester et al. (1990). Possible application to large multielement telescopes with 
multiband processing in the infrared and visible ranges has been discussed by 
Swenson et al. (1986). 

From the submillimeter radio range to the optical is a factor of ~$10^3$ in
wavelength, and a further factor of ~$10^3$ takes one to the X-ray region.
X-ray astronomy could benefit greatly by the potentially high angular resolution
obtainable through interferometry. The viability of X-ray interferometry, suitable 
for astronomical imaging, has been demonstrated in the laboratory by Cash et al. 
(2000). It holds the promise of providing extremely high angular resolution in 
observations above the atmosphere. At a wavelength of 2 nm, a baseline of 1 m 
provides a fringe spacing of about 400 μas. In the laboratory instrument, the apertures
are defined by flat reflecting surfaces, which are used at grazing incidence to 
minimize the requirement for surface accuracy. Direct detection is the only available 
technique, and if the fringes are formed by simply allowing the reflected beams to 
converge on a detector surface, a long distance is required to obtain sufficient fringe 
spacing. With 400-uas angular spacing of the fringes, adjacent maxima would be 
separated by only 1 um at 500-m distance. Astronomical interferometry at X-ray 
wavelengths will be a challenging enterprise. 
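
The fringe-spacing figures quoted above follow from the small-angle relation that the angular fringe spacing is the wavelength divided by the baseline. A minimal sketch of the arithmetic, with illustrative function names, is:

```python
import math

# Microarcseconds per radian
RAD_TO_UAS = math.degrees(1.0) * 3600.0 * 1e6


def fringe_spacing_uas(wavelength_m, baseline_m):
    """Angular fringe spacing lambda / D, in microarcseconds."""
    return wavelength_m / baseline_m * RAD_TO_UAS


def fringe_separation_m(angle_uas, distance_m):
    """Linear separation of adjacent fringe maxima at a given distance."""
    return angle_uas / RAD_TO_UAS * distance_m


spacing = fringe_spacing_uas(2e-9, 1.0)        # 2 nm wavelength, 1 m baseline
separation = fringe_separation_m(400.0, 500.0)  # 400 uas fringes at 500 m
print(f"fringe spacing:    {spacing:.0f} uas")
print(f"linear separation: {separation * 1e6:.2f} um")
```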

17.6.2 Sensitivity of Direct Detection and Heterodyne Systems 


Factors that determine the sensitivity of optical systems, such as losses due to 
scattering, partial reflection, and absorption, are different from corresponding 
effects at radio wavelengths. However, in heterodyne systems, the most important 
difference is the role of quantum effects. The energy of optical photons is five or 
more orders of magnitude greater than that of radio photons, and quantum effects 
are largely negligible in the radio domain at frequencies lower than ∼100 GHz. In 
the optical range (wavelength ∼500 nm), the frequency is of order 600 THz, and 
the bandwidth could be as high as 100 THz. In a typical heterodyne system in the 
infrared, the wavelength of 10 µm corresponds to 30 THz, and the bandwidth used 
is ∼3 GHz [see, e.g., Townes et al. (1998)]. 

In direct detection systems, the detector or photon counter does not preserve 
the phase of the signal, and thus the noise resulting from the uncertainty principle, 
discussed in Sect. 1.4, does not occur. The noise is principally shot noise resulting 
from the random arrival times of the signal photons. The number of photons received 
from a source of intensity $I$ is 

$$N = \frac{I\,\Omega_s A\,\Delta\nu}{h\nu} \quad (\text{photons s}^{-1}), \tag{17.27}$$

where $\Omega_s$ is the solid angle of the source (with no atmospheric blurring), $A$ is the 
collecting area of the telescope, $\Delta\nu$ is the bandwidth, $\nu$ is the frequency, and $h$ is 
Planck's constant. If the source is a blackbody at temperature $T$, the Planck formula 
gives 

$$I = \frac{2h\nu^3}{c^2}\,\frac{1}{e^{h\nu/kT} - 1}. \tag{17.28}$$


Note that for direct detection, we are considering the signal in both polarizations. 
Thus, we have 


$$N = \frac{2\Omega_s A\,\Delta\nu}{\lambda^2}\,\frac{1}{e^{h\nu/kT} - 1} \quad (\text{photons s}^{-1}). \tag{17.29}$$

The received power is 

$$P = h\nu N. \tag{17.30}$$


The fluctuations in the power, $\Delta P_D$, are caused by photon shot noise and therefore 
are proportional to $\sqrt{N}$. Thus, 

$$\Delta P_D = h\nu\sqrt{N}. \tag{17.31}$$

$\Delta P_D$ is known as the noise equivalent power. The SNR in one second is 
$P/\Delta P_D = \sqrt{N}$, and therefore for an integration time $\tau_a$, the SNR for direct 
detection is 

$$\mathcal{R}_{\rm snD} = \left(\frac{2\Omega_s A\,\Delta\nu\,\tau_a}{\lambda^2}\right)^{1/2} \left(\frac{1}{e^{h\nu/kT} - 1}\right)^{1/2}, \tag{17.32}$$

where the subscript D indicates direct detection. Note that $\mathcal{R}_{\rm snD}$ is proportional to 
$\sqrt{A}$, because of the shot noise, rather than to $A$, as in the radio case. 
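
The chain from Eq. (17.27) to Eq. (17.32) can be checked numerically. The following sketch evaluates the photon rate both from the Planck intensity and from the closed form of Eq. (17.29), and then the direct detection SNR. The source parameters (a 3000-K blackbody of 20-mas diameter observed at 10 µm with a 2-m² aperture and a 3-THz bandwidth) are hypothetical values chosen only for illustration, as are the function names.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def planck_intensity(nu, temp):
    """Eq. (17.28): blackbody intensity in both polarizations."""
    return 2.0 * H * nu**3 / C**2 / math.expm1(H * nu / (K * temp))


def photon_rate(nu, temp, omega_s, area, bandwidth):
    """Eq. (17.27) with the Planck intensity inserted (photons per second)."""
    return planck_intensity(nu, temp) * omega_s * area * bandwidth / (H * nu)


def photon_rate_closed_form(nu, temp, omega_s, area, bandwidth):
    """Eq. (17.29), written in terms of the wavelength."""
    lam = C / nu
    return 2.0 * omega_s * area * bandwidth / lam**2 / math.expm1(H * nu / (K * temp))


def snr_direct(nu, temp, omega_s, area, bandwidth, tau_a):
    """Eq. (17.32): shot-noise-limited SNR, sqrt(N * tau_a)."""
    return math.sqrt(photon_rate(nu, temp, omega_s, area, bandwidth) * tau_a)


# Hypothetical example values (not from the text)
nu = C / 10e-6                                   # 10 um wavelength
theta = 20e-3 / 3600.0 * math.pi / 180.0         # 20 mas diameter, in radians
omega_s = math.pi * (theta / 2.0) ** 2           # solid angle of a uniform disk
args = (nu, 3000.0, omega_s, 2.0, 3e12)

print(f"N (17.27 + 17.28): {photon_rate(*args):.3e} photons/s")
print(f"N (17.29):         {photon_rate_closed_form(*args):.3e} photons/s")
print(f"SNR in 1 s:        {snr_direct(*args, 1.0):.3e}")
```

Doubling the collecting area in this sketch raises the SNR by a factor of $\sqrt{2}$, which is the $\sqrt{A}$ dependence noted above.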

In a heterodyne system, the noise is determined by the uncertainty principle, 
since the mixer is a linear device that preserves phase. The minimum noise is one 
photon per mode (one photon per hertz per second), as noted in the discussion 
following Eq. (1.15). This is equivalent to saying that the system temperature is 
$h\nu/k$ [see, e.g., Heffner (1962), Caves (1982)]. Hence, in a period of one second, 
the uncertainty in power is 

$$\Delta P_H = h\nu\sqrt{\Delta\nu}. \tag{17.33}$$

The heterodyne detector responds only to the component of the radiation to which 
its polarization is matched, and the received power is half of that in Eq. (17.30). The 
SNR for a heterodyne system (indicated by subscript H) is therefore $P/(2\Delta P_H)$ in 
one second, and in time $\tau_a$, it is 

$$\mathcal{R}_{\rm snH} = \left(\frac{\Omega_s A}{\lambda^2}\right) \frac{1}{e^{h\nu/kT} - 1}\,\sqrt{\Delta\nu\,\tau_a}. \tag{17.34}$$


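Note that Eq. (17.34) reduces to the usual radio form in Eq. (1.8) when $h\nu/kT \ll 1$. 
In that case, $T_A = T\Omega_s A/\lambda^2$ and the minimum value of $h\nu/k$ can be used for system 
temperature. The ratio of the SNR for the heterodyne system to that for the direct 
detection system, when parameters other than the bandwidth are the same, is 

$$\frac{\mathcal{R}_{\rm snH}}{\mathcal{R}_{\rm snD}} = \left[\left(\frac{\Omega_s A}{2\lambda^2}\right) \frac{1}{e^{h\nu/kT} - 1}\right]^{1/2} \left(\frac{\Delta\nu_H}{\Delta\nu_D}\right)^{1/2}. \tag{17.35}$$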


As indicated earlier, $\sqrt{\Delta\nu_H/\Delta\nu_D}$ could be as low as $\sim 4 \times 10^{-3}$. However, 
for direct detection, the propagation delays through the different siderostats to 
the fringe-forming point must be maintained constant to ∼1/10 of the reciprocal 
bandwidth. This requirement restricts the bandwidths that can practicably be used, 
especially with baselines of hundreds of meters. The heterodyne system offers 
simpler hardware that provides useful sensitivity at 10 µm wavelength and possibly 
to the next atmospheric window at 5 µm. It also allows the amplified IF signals 
to be split without loss in sensitivity, to provide multiple simultaneous correlations 
in multielement arrays. Relative advantages of the heterodyne and direct detection 
systems are discussed by Townes and Sutton (1981) and de Graauw and van de Stadt 
(1981). 
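
As a consistency check on Eq. (17.35), the sketch below evaluates the heterodyne and direct detection SNRs from Eqs. (17.34) and (17.32) separately and compares their quotient with the closed-form ratio. All numerical inputs are hypothetical (a 3000-K source at 10 µm, a 3-GHz heterodyne bandwidth, and a 3-THz direct detection bandwidth), and the function names are illustrative.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def occupation(nu, temp):
    """Photon occupation number 1 / (exp(h nu / k T) - 1)."""
    return 1.0 / math.expm1(H * nu / (K * temp))


def snr_direct(nu, temp, omega_s, area, bw_d, tau_a):
    """Eq. (17.32)."""
    lam = C / nu
    return math.sqrt(2.0 * omega_s * area * bw_d * tau_a / lam**2 * occupation(nu, temp))


def snr_heterodyne(nu, temp, omega_s, area, bw_h, tau_a):
    """Eq. (17.34)."""
    lam = C / nu
    return omega_s * area / lam**2 * occupation(nu, temp) * math.sqrt(bw_h * tau_a)


def snr_ratio(nu, temp, omega_s, area, bw_h, bw_d):
    """Eq. (17.35): heterodyne SNR divided by direct detection SNR."""
    lam = C / nu
    return math.sqrt(omega_s * area / (2.0 * lam**2) * occupation(nu, temp)) * math.sqrt(bw_h / bw_d)


# Hypothetical example values (not from the text)
nu, temp, omega_s, area = C / 10e-6, 3000.0, 7.4e-15, 2.0
bw_h, bw_d, tau_a = 3e9, 3e12, 1.0

r_h = snr_heterodyne(nu, temp, omega_s, area, bw_h, tau_a)
r_d = snr_direct(nu, temp, omega_s, area, bw_d, tau_a)
print(f"heterodyne SNR:        {r_h:.3e}")
print(f"direct detection SNR:  {r_d:.3e}")
print(f"ratio, direct:         {r_h / r_d:.3e}")
print(f"ratio, Eq. (17.35):    {snr_ratio(nu, temp, omega_s, area, bw_h, bw_d):.3e}")
print(f"quantum-limited T_sys: {H * nu / K:.0f} K")
```

The last line prints the minimum system temperature $h\nu/k$ for the chosen wavelength.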

17.6.3 Optical Intensity Interferometer 


The use of the intensity interferometer for optical measurements on stars was 
demonstrated by Hanbury Brown and Twiss (1956a), shortly after the success 
of the radio intensity interferometer described in Sects. 1.3.7 and 17.1. At that 
time, the possibility of coherence between photons in different light rays from the 
same source was questioned; the physical basis and its consistency with quantum 
mechanics are explained by Hanbury Brown and Twiss (1956c) and Purcell (1956). 
The laboratory demonstration of the correlation of intensity fluctuations of light by 
Hanbury Brown and Twiss (1956b) led to the appreciation of the phenomenon of 
photon bunching and to the broader development of quantum statistical studies and 
to their application to particle beams as well as electromagnetic radiation (Henny 
et al. 1999). 

In the optical intensity interferometer, a photomultiplier tube at the focus of 
each telescope mirror replaces the RF and IF stages and the detectors of the radio 
instrument. The photomultiplier outputs are amplified and fed to the inputs of the 
correlator. The optical intensity interferometer is largely insensitive to atmospheric 
phase fluctuations, as explained for the radio case in Sect. 17.1. The size of the 
light-gathering apertures is therefore unrestricted by the scale size of the irregularities. 
Also, it is not necessary that the reflecting mirrors produce a diffraction-limited 
image, and their accuracy need only be sufficient to deliver all the light to the 
photomultiplier cathodes. This is fortunate since the low sensitivity mentioned 
earlier for the radio case necessitates the use of large light-gathering areas. Hanbury 
Brown (1974) gave an analysis of the response of the optical instrument and showed 
that it is proportional to the square of the visibility modulus as in the radio case. 
Either a correlator or a photon coincidence counter can be used to combine the 
photomultiplier outputs. 

The intensity interferometer constructed at Narrabri, Australia (Hanbury Brown 
et al. 1967; Hanbury Brown 1974), used two 6.5-m-diameter reflectors and a 
bandwidth of 60 MHz for the signals at the correlator inputs. The resulting 
limiting magnitude of +2.5 enabled measurements of 32 stars to be made. Davis 
(1976) has discussed the relative merits of the intensity interferometer and modern 
implementations of the Michelson interferometer for development of more sensitive 
instruments. 


17.6.4 Speckle Imaging 


The image of an unresolved point source observed with a telescope whose 
aperture width is large compared with the Fried length $d_0$ depends on the 
exposure time over which the image is averaged. An exposure no longer than 10 ms 
shows a group of bright speckles, each of which is the approximate size of the 
Airy disk (i.e., the diffraction-limited point-source image) of the telescope. If the 
exposure is much longer, the pattern is blurred into a single patch (the “seeing” 
disk) of typical diameter 1″, determined by the atmosphere. The characteristic 
fluctuation time of 10 ms in the optical range corresponds to the time taken for 
an atmospheric cell of size $d_0 \sim 0.14$ m to move past any point in the telescope 
aperture at a typical wind speed of 10–20 m s$^{-1}$. The use of sequences of 
short-exposure images to obtain information at the diffraction limit of a large telescope 
is known as speckle imaging. Speckle patterns reflect the random distribution of 
atmospheric irregularities over the aperture and differ from one exposure to the next 
on the 10-ms timescale. Reduction of many exposures is required to observe faint 
objects by this technique. 

For the theory of the speckle response, see, for example, Dainty (1973), 
Bates (1982), or Goodman (1985). Here we note that the high-resolution image 
represented by a single speckle can be understood if one considers each speckle as 
resulting from several seeing cells of the wavefront, located at points distributed 
across the telescope aperture. These cells are the ones that present approximately 
equal phase shifts in the ray paths from the wavefront to the speckle image (Worden 
1977). Then, by analogy with an array of antennas, the resolution corresponds to the 
maximum spacing of the cells, that is, it is of the order $\lambda/d$, where $d$ is the telescope 
aperture. Aberrations in the reflector do not significantly degrade the speckle pattern 
as long as the dominant phase irregularities are those of the atmosphere. The area 
of the image over which the speckles are spread corresponds to $\lambda/d_0$ on the sky 
and becomes the seeing disk in a long exposure. The seeing cells can be regarded 
as subapertures within the main telescope aperture, the responses of which combine 
with random phases in the image. The number of speckles is of the order of the 
number of subapertures, that is, $(d/d_0)^2$. With a large telescope ($d \sim 1$ m), this 
number is of the order of 50 at optical wavelengths. Also, the size of the seeing cells 
increases with wavelength, and in the infrared, only a few speckles appear in the 
image. 
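
The orders of magnitude quoted in this section can be reproduced with a few lines of arithmetic. In the sketch below, the wavelength scaling of the Fried length as $\lambda^{6/5}$ is an assumption (the standard result for Kolmogorov turbulence, not stated in the text above), and the 14 m s$^{-1}$ wind speed and 2.2-µm infrared wavelength are hypothetical example values.

```python
import math

RAD_TO_ARCSEC = math.degrees(1.0) * 3600.0


def crossing_time_s(d0_m, wind_speed_m_s):
    """Time for a seeing cell of size d0 to move past a point in the aperture."""
    return d0_m / wind_speed_m_s


def seeing_disk_arcsec(wavelength_m, d0_m):
    """Angular extent lambda / d0 of the long-exposure seeing disk."""
    return wavelength_m / d0_m * RAD_TO_ARCSEC


def n_speckles(d_m, d0_m):
    """Approximate number of speckles, (d / d0)^2."""
    return (d_m / d0_m) ** 2


def fried_length_scaled(d0_ref_m, wavelength_ref_m, wavelength_m):
    """Assumed Kolmogorov scaling: d0 proportional to wavelength^(6/5)."""
    return d0_ref_m * (wavelength_m / wavelength_ref_m) ** 1.2


d0_optical = 0.14  # m, at a wavelength of about 500 nm
print(f"crossing time:        {crossing_time_s(d0_optical, 14.0) * 1e3:.1f} ms")
print(f"seeing disk:          {seeing_disk_arcsec(500e-9, d0_optical):.2f} arcsec")
print(f"speckles (optical):   {n_speckles(1.0, d0_optical):.0f}")

d0_infrared = fried_length_scaled(d0_optical, 500e-9, 2.2e-6)
print(f"speckles (2.2 um):    {n_speckles(1.0, d0_infrared):.1f}")
```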

A rather simple image restoration technique called the “shift-and-add” algorithm 
can be applied to speckle images (Christou 1991). It works best when there is a 
point source in the field, and at infrared wavelengths where there are relatively few 
speckles per frame and the isoplanatic patch is relatively large (see Table 17.1). The 
short exposure speckle frames are aligned on their brightest speckles and summed. 
The point-spread function (“dirty beam”), which can be obtained from the image of 
a point source within the field, will have a diffraction-limited component and a much 
broader component composed of the fainter speckles. This step can be followed by 
other restoration algorithms such as CLEAN (see Sect. 11.1) to improve the image 
quality further [see, e.g., Eckart et al. (1994)]. 
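
A minimal sketch of the shift-and-add step described above is given here, using NumPy. It is not the implementation of Christou (1991): frames are aligned on their single brightest pixel, shifts are applied cyclically with `np.roll` for simplicity, and the synthetic test frames (one bright speckle plus four fainter ones per frame, at random positions) are hypothetical.

```python
import numpy as np


def shift_and_add(frames):
    """Align each short-exposure frame on its brightest pixel and average.

    Shifts are cyclic (np.roll), a simplifying assumption that treats each
    frame as periodic.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames, ny, nx = frames.shape
    cy, cx = ny // 2, nx // 2
    total = np.zeros((ny, nx))
    for frame in frames:
        py, px = np.unravel_index(np.argmax(frame), frame.shape)
        total += np.roll(frame, (cy - py, cx - px), axis=(0, 1))
    return total / n_frames


# Hypothetical synthetic data: a point source seen through random speckles
rng = np.random.default_rng(0)
size, n_frames = 32, 50
frames = np.zeros((n_frames, size, size))
for frame in frames:
    by, bx = rng.integers(0, size, 2)
    frame[by, bx] += 1.0                 # brightest speckle
    for _ in range(4):
        fy, fx = rng.integers(0, size, 2)
        frame[fy, fx] += 0.2             # fainter speckles

stacked = shift_and_add(frames)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(stacked), stacked.shape))
print("peak of stacked image at pixel", peak)
```

The stacked image shows a sharp central peak from the aligned bright speckles on top of a broad, low pedestal from the fainter ones, which corresponds to the two-component point-spread function described above.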

When the shift-and-add algorithm is not applicable, the modulus of the visibility 
can be obtained by the technique of speckle interferometry, which originated with 
Labeyrie (1970). This procedure can be understood from the following simplified 
discussion. On a single image of short exposure, a number of approximately 
diffraction-limited speckles appear at random locations within the seeing disk. The 
speckle image $I_s(l,m)$ can be described as the convolution of the actual intensity 
distribution $I(l,m)$ with the speckle point-spread function $P(l,m)$. Thus, 

$$I_s(l,m) = I(l,m) \ast\ast\, P(l,m). \tag{17.36}$$

The function $P(l,m)$ is a random function that cannot be specified exactly. As a 
first approximation, we will assume that $P(l,m)$ is the point-spread function of the 
telescope in the absence of atmospheric effects, $b_0(l,m)$, replicated at the position 
of each speckle. Thus, we can write 

$$P(l,m) = \sum_i b_0(l - l_i, m - m_i), \tag{17.37}$$

where $l_i$ and $m_i$ are the locations of the speckles, all of which are assumed to have 
the same intensity. From Eqs. (17.36) and (17.37), we obtain 

$$I_s(l,m) = \sum_i I(l,m) \ast\ast\, b_0(l - l_i, m - m_i). \tag{17.38}$$

If the Fourier transform of $b_0(l,m)$ is $\bar{b}_0(u,v)$, then the Fourier transform of 
$b_0(l - l_i, m - m_i)$ is $\bar{b}_0(u,v) \exp[j2\pi(ul_i + vm_i)]$. Hence, the Fourier transform 
of Eq. (17.38) can be written as 

$$\bar{I}_s(u,v) = \sum_i \mathcal{V}(u,v)\,\bar{b}_0(u,v)\,e^{j2\pi(ul_i + vm_i)}, \tag{17.39}$$


where $\mathcal{V}$ and $\bar{I}_s$ are the Fourier transforms of $I$ and $I_s$, respectively. The speckle 
transforms $\bar{I}_s$ cannot be summed directly because of random phase factors in 
Eq. (17.39). To eliminate these phase factors, we calculate $|\bar{I}_s|^2$ (i.e., $\bar{I}_s \bar{I}_s^*$), which is 

$$\begin{aligned}
|\bar{I}_s(u,v)|^2 &= \sum_i \sum_k |\mathcal{V}(u,v)|^2\,|\bar{b}_0(u,v)|^2\, e^{j2\pi[u(l_i - l_k) + v(m_i - m_k)]} \\
&= |\mathcal{V}(u,v)|^2\,|\bar{b}_0(u,v)|^2 \left[ N + \sum_{i \neq k} e^{j2\pi[u(l_i - l_k) + v(m_i - m_k)]} \right],
\end{aligned} \tag{17.40}$$

where $N$ is the number of speckles. Since the expectation of the summation term in 
the second line of Eq. (17.40) is zero, the expectation of Eq. (17.40) is 

$$\langle |\bar{I}_s(u,v)|^2 \rangle = N_0\,|\mathcal{V}(u,v)|^2\,|\bar{b}_0(u,v)|^2, \tag{17.41}$$


where $N_0$ is the average number of speckles. Hence, the average of a series of 
measurements of $|\bar{I}_s(u,v)|^2$, estimated from short exposures, is proportional to the 
squared modulus of $\mathcal{V}(u,v)$ times the squared modulus of $\bar{b}_0(u,v)$. Since $\bar{b}_0(u,v)$ 
is nonzero for $|u|$ and $|v| < D/\lambda$ ($D$ being the telescope aperture diameter), the 
function $|\mathcal{V}(u,v)|^2$ can be determined over the same range of $u$ and $v$, if $\bar{b}_0(u,v)$ 
is known. In practice, the speckles cannot be accurately modeled by Eq. (17.37). 
However, we can write 

$$\langle |\bar{I}_s(u,v)|^2 \rangle = |\mathcal{V}(u,v)|^2\,\langle |\bar{P}(u,v)|^2 \rangle, \tag{17.42}$$

where $\bar{P}(u,v)$ is the Fourier transform of $P(l,m)$. From Eqs. (17.41) and (17.42), 
$\langle |\bar{P}(u,v)|^2 \rangle$ should be approximately proportional to $|\bar{b}_0(u,v)|^2$. It can be estimated 
by observing a point source under the same conditions as those for the source under 
study. 
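
The procedure of Eqs. (17.36)–(17.42) can be illustrated with a small simulation. The sketch below is built on several assumptions that are not in the text: the telescope response $b_0$ is taken to be a Gaussian, every frame has the same number of speckles, convolutions are cyclic (computed with FFTs), and the test object is a hypothetical binary source. NumPy's FFT uses the opposite sign in the exponent to Eq. (17.39), which does not affect the squared moduli.

```python
import numpy as np


def speckle_psf(rng, shape, n_speckles, sigma=1.0):
    """Eq. (17.37): copies of b0 (a Gaussian here, by assumption) placed
    at random speckle positions, using cyclic convolution."""
    ny, nx = shape
    impulses = np.zeros(shape)
    ys = rng.integers(0, ny, n_speckles)
    xs = rng.integers(0, nx, n_speckles)
    np.add.at(impulses, (ys, xs), 1.0)
    y = np.fft.fftfreq(ny) * ny          # pixel offsets 0, 1, ..., -1
    x = np.fft.fftfreq(nx) * nx
    b0 = np.exp(-(y[:, None] ** 2 + x[None, :] ** 2) / (2.0 * sigma**2))
    return np.real(np.fft.ifft2(np.fft.fft2(impulses) * np.fft.fft2(b0)))


def mean_power_spectrum(obj, rng, n_frames, n_speckles):
    """Average of |FT(I ** P)|^2 over many short exposures."""
    obj_ft = np.fft.fft2(obj)
    acc = np.zeros(obj.shape)
    for _ in range(n_frames):
        psf_ft = np.fft.fft2(speckle_psf(rng, obj.shape, n_speckles))
        acc += np.abs(obj_ft * psf_ft) ** 2   # Eq. (17.36) in the Fourier domain
    return acc / n_frames


rng = np.random.default_rng(1)
shape = (64, 64)
binary = np.zeros(shape)
binary[0, 0], binary[0, 4] = 1.0, 0.5        # hypothetical binary source
point = np.zeros(shape)
point[0, 0] = 1.0                            # reference point source

# Eq. (17.42): divide by <|P|^2> estimated from the reference source
vis2_est = (mean_power_spectrum(binary, rng, 200, 20)
            / mean_power_spectrum(point, rng, 200, 20))
vis2_true = np.abs(np.fft.fft2(binary)) ** 2
rel_err = np.abs(vis2_est - vis2_true) / vis2_true
print(f"median relative error in |V|^2: {np.median(rel_err):.3f}")
```

Because the source and the reference are observed through independent speckle realizations, the estimate is noisy at each $(u,v)$ point, and the scatter decreases as more frames are averaged.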

The phase information can be extracted from the speckle frames but with 
considerably more computational effort. Most phase-retrieval algorithms are variations 
of two basic methods: the Knox–Thompson, or cross-spectral, method (Knox and 
Thompson 1974; Knox 1976) and the bispectrum method (Lohmann et al. 1983). 
These methods are described in detail by Roggemann et al. (1997). 
Hilbert transform, 104, 106, 215, 229, 352 
Hinge point, 704 
Hipparcos satellite, 2 
star catalog, 602, 603 
Historical development, 13—44 
analog Fourier transformation, 544 
imaging from one-dimensional profiles, 
543 
receivers, 221, 256 
VLBI, 37-42, 391-394 
Holes in spatial frequency coverage, 173 
Holography. See Antenna(s), measurements, 
holographic 
Hour angle, 109-113, 146 
Hubble constant, 543 
Hybrid correlator, 366 
Hybrid mapping, 40, 563 
Hydrogen line, 6, 31, 442-446 
Hydrostatic equilibrium, 661 


IAU 
polarization standard, 124, 125 
radio source nomenclature, 10 
ICRF, 11, 40, 602 
ICRS, 602 
IEEE 
committee on frequency stability, 426 
polarization standard, 124 
power flux density, 6 
IF. See Intermediate frequency 
Illumination, aperture. See Antenna(s), 
aperture illumination 
Image defects. See also Phase, noise 
correlator offset, 533 
distortion, 533 
errors in visibility data, 533 
ghost, 530-532 
Incoherence assumption (spatial), 90, 774, 
810 
Incoherent averaging, 415—419, 425 
Incoherent source, response to, 777—779 
Inertial reference frame, 602 
Infrared interferometry 
detection of planets, 831 
heterodyne detection, 834 
Instrumental (compensation) delay. See Delay, 
instrumental 
Instrumental polarization, 129-133, 
137-142 
degrees of freedom, 139 
Intensity, 9, 530 
derivation, 490, 504 
interpretation, 530 
scale, 530, 564 
Intensity interferometer, 24—25, 809-814 
optical, 835 
sensitivity of, 419 
Interference, radio 
satellites, 804-805 
(u, v) plane distribution, 797—800 
VLBI, 801-803 
Interferometer 
adding (simple), 19, 20 
basic components, 99—102 
compound, 30 
correlator, 22 
infrared. See Infrared interferometry 
intensity. See Intensity interferometer 
Michelson, 14—18 
optical (modern Michelson), 827-835 
sea, 20 
spectral-line, 31 
Intermediate frequency (IF), 208 
amplifier, 262 
subsystem, 261-262 
International Astronomical Union. See IAU 
International Atomic Time (IAT), 619 
International Celestial Reference Frame, 11, 
40, 602 
International Reference Ionosphere (IRI), 
734 
International Telecommunication Union (ITU), 
793 
Radiocommunication Sector, 795 
International VLBI Service (IVS), 40 
Interplanetary medium 
excess path length, 747 
refraction, 744—748 
scintillation, 748-750 
scintillation index, 749 
Interpolation, 159, 496-497. See also Gridding 
(convolutional) 
Interstellar masers, 755 
Interstellar medium, 750-758 
dispersion measure, 751 
electron density, 751 
Faraday rotation, 751-753 
pulsar signals, effects on, 751 
scattering 
diffractive, 753-758 
Fiedler, 757 
refractive, 756-758 
Invisible distribution, 490 
Ionosphere 
absorption, 725-735 
achromatic spherical lens, 731 
acoustic-gravity waves, 736 
effects of irregularities, 735-737 
Faraday rotation, 726, 730 
Gaussian screen model, 737—742 
index of refraction, 728-730 
phase stability, 726 
power-law model, 742-744 
propagation delay, 730-742 
refraction, 730-733 
scintillation, 736, 737 
total electron content, 726, 732, 737 
traveling ionospheric disturbances (TIDs), 
736 
Isoplanatic 
angle, neutral atmosphere, 828 
patch 
ionosphere, 509, 587, 726 
neutral atmosphere, 828 
ITU, 805 
ITU-R. See International Telecommunication 
Union, Radiocommunication Sector 


J² synthesis (J-squared synthesis), 187 

Jansky (unit), 6 

Jinc function, 492 

Jodrell Bank Observatory, England, 24, 26, 34, 
187, 391 

Johnson noise, 4 

Jones matrix, 134 

Julian year, 617 

Jupiter, 391 


Kolmogorov turbulence, 685, 690, 743-744, 
754 

Kramers—Kronig relation, 674, 678 

Kronecker delta function, 380 


Leakage (polarization), 130, 146-147 
Leakage (sampling), 158 
Leap second, 619 
Least-mean-squares analysis, 636-648 
accuracy, 645 
correlated measurements, 643 
covariance matrix, 643 
design matrix, 645 
error ellipse, 644, 648 
estimation of delay, 641 
estimation of fringe frequency, 641 
likelihood function, 636 
matrix formulation, 643 
nonlinear case, 646 
normal equation matrix, 643, 644, 647 
partial derivative matrix, 643 
precision, 645 
self-calibration, 566 
sinusoid fitting, 236 
variance matrix, 643 
weighted, 637 
Length of day, 620 
Lensclean, 589 
Light, speed (velocity) of, 60, 600 
Likelihood function, 636 
Line of nodes, 617. See also Equinox 
Linear arrays, 173—178 
Lines, radio. See Spectral line(s) 
Lloyd’s mirror, 20 
LO. See Local oscillator 
Local oscillator, 3, 46, 207—209, 261. See also 
Frequency standards 
independent, 37. See also VLBI 
laser, 832 
multiplication, stability, 446 
nonsynchronized, 813 
phase switching of, 298 
synchronization of, 264—277 
Local standard of rest, 541 
Long-baseline interferometer, 391 
Loran, 447 
Lorentz equation, 728 
Lorentz factor, 542 
Lorentzian profile, 676, 678, 684 
Low-frequency imaging, 587-589 
Low-noise input stage, 221, 256 
Lunar occultation 
optical, 814, 818 
radio, 24, 601, 749, 814-819 
Lutz—Kelker effect, 624 


Magellanic Cloud, Small, 575 
Magnetic fields 
in frequency standards, 442, 443, 
446 
interstellar, 750 
terrestrial, 729 
Magnetic tape recording, 37, 450 
Magnitude of visibility, 72n 
Mapping 
two-dimensional, 73 
visibility amplitude only, 569 
wide-field, 95-98, 245-246, 570-577 
Markov chain Monte Carlo (MCMC) 
algorithm, 513, 624, 646 
Maryland Point Observatory, Maryland, 417 
Maser frequency standard, 442-446 
Maser radio sources, 6, 38, 391, 393, 417 
mapping procedures, 631-636 
scattering, 755 
spatial coherence, 785 
Master oscillator, 261 
Mauna Kea, Hawaii, 37, 187, 693, 695 
Mauritius Radio Telescope, 186 
Maximum entropy method (MEM), 557-559, 
590 
Maximum-likelihood method, 464, 636, 
648 
Maxwell’s relation, 676 
MeerKAT, 199 
Meridian, 109 
Greenwich, 109, 406, 607, 618, 620 
local, 109, 600, 620 
plane, 109, 604 
transit (crossing), 620 
MERLIN, 26, 187, 566 
Meter, definition, 600 
Michelson interferometer, 13—18 
Microwave link. See Radio link 
Mie scattering, 824 
Millibar, 659 
Millimeter-wavelength arrays, 37 
Mills cross, 26-28, 168-172 
Minimum redundancy. See Arrays, minimum 
redundancy; Bandwidth, synthesis 
Mirror-image reception pattern, 67 
Misell algorithm, 823 
Mixer, 208-209. See also Frequency, 
conversion 
sideband-separating, 301 
MKSA units, 675n 
Model 
adaptive calibration, 566 
circular disk, 17 
Cygnus A, 25 
delta function (CLEAN), 552 
fitting, 510-520 
Gaussian, 17, 32 
rectangular, 17 
Modulated reflector, 265 
Molonglo, Australia, 28 
Moment property, 81 
Moon, 44. See also Lunar occultation; 
Precession 
Mosaicking (mosaic imaging), 570-577, 692 
arrays for, 570-577 
linear, 574 
nonlinear, 574—575 
on-the-fly, 577 
Mueller matrix, 136 
Mullard Radio Astronomy Observatory. See 
Cambridge, England 
Multifrequency synthesis, 578 
Multiplier (voltage), 22, 210. See also 
Correlator 
Mutual coherence function, 767—771 


Nançay Observatory, 30, 31 
Narrabri, Australia, 835 
National Aeronautics and Space 
Administration (NASA), 3, 
40, 453 
Extragalactic Database (NED), 5, 11 
National Geodetic Survey (NGS), 3 
National Radio Astronomy Observatory 
(NRAO), 34, 38, 180, 392, 453, 568. 
See also ALMA; Green Bank, West 
Virginia; Very Large Array; Very 
Long Baseline Array 
Natural weighting, 232, 491, 494 
Naval Observatory, U.S. (USNO), 3, 619 
Naval Research Laboratory (NRL), 3, 417 
NAVSTAR. See GPS 
Near-field observations, 473, 775 
Negative frequencies, 64, 70, 105, 106 
Network Users Group (US), 38-41 
Neutral atmosphere 
opacity, 693-696 
phase stability, 697-701 
Nobel lecture, Ryle, 35 
Nobeyama Radio Observatory (NRO), Japan, 
37 
Noise. See also Signal-to-noise ratio 
amplitude and phase, 233-235 
equivalent power (NEP), 834 
in complex visibility, 228—230, 236 
in image, 230-233 
in oscillators 
flicker-frequency, 433—434 
flicker-phase, 433—434 
random-walk-of-frequency, 433—434 
white-frequency, 433–434 
white-phase, 433—434 
in VLBI, 407—412 
photon shot noise, 441, 445, 834 
power, 11, 257 
quantum effect, 44—45 
response to, 223-235 
temperature measurement, 257—260 
Noncoplanar baselines, 97—98, 579-585 
3D Fourier transform, 581—582 
polyhedron mapping, 582 
snapshot combination, 582 
variable point-source response, 583 
Nonnegative least-squares, 562 
North Liberty, Iowa, 38 
NRAO. See National Radio Astronomy 
Observatory 
NRAO VLA Sky Survey (NVSS), 11 
Nuffield Radio Astronomy Laboratories. See 
Jodrell Bank Observatory, England 
Nutation, 2, 10, 616-617 
NVSS. See NRAO VLA Sky Survey 
Nyquist power theorem, 12 
Nyquist rate (frequency), 312-313. See also 
Sampling theorem 
Nyquist sampling theorem, 46 


Observation, planning, and reduction, 534-535 
Occultation observations. See Lunar 
occultation; Precession 

On-the-fly mosaicking, 577 
Opacity, 670-672 

measurement of, 672-673 
Optical depth. See Opacity 
Optical fiber, 262-264, 273-274 

dispersion, 263, 302-303 

high stability, 274 
Optical interferometry, 45, 827-835 

direct and heterodyne detection, 832-834 
Orbiting VLBI. See OVLBI 
Oscillator coherence time, 434—436 
Oscillator strength, 675 
Outer product, 135, 293 
Overlap processing, 360 
Overlapping segment averaging, 360 
Oversampling, 313, 315-316, 325, 328, 347 
OVLBI, 42-44, 188-191, 470-476 

data link, 471-472 

round-trip phase, 471 

timing link, 471-472 
Owens Valley Radio Observatory, California, 
30, 32, 37, 38, 706 
Parabolic main reflector, 155 
Parabolic-cylinder reflector, 153 
Parallactic angle, 121, 128, 139, 141 
Parallax, 617, 623 
Parametric amplifier, degenerate, 221 
Parametrized Ionospheric Model (PIM), 734 
Parseval’s theorem, 84, 376, 381, 685, 799, 803 
Partial coherence, 778 
Passband 
Gaussian, 64, 244 
rectangular, 64, 243, 278 
tolerances, 279-282 
Passive radar, 824 
Peeling, 587 
Pencil beam, 27, 177 
Permittivity, 675n 
Peryton, 790 
Phase 
errors, effect on sensitivity, 277 
noise 
effects on maps, 570, 633 
in frequency multipliers, 446 
in frequency standards, 415, 425-434 
in receivers, 233—235, 407-408 
neutral atmospheric, 680-692, 813 
Phase center shifting, 398 
Phase closure, 25—26, 40, 393, 489, 505-510 
Phase data 
imaging without, 569 
uncalibrated, 563-569 
Phase fluctuations, 680-690 
Phase reference 
feature, 632-636 
position, 73, 89, 93, 110 
Phase referencing 
atmospheric effects, 709-711 
for masers, 632 
in VLBI, 610-616 
Phase stability 
analysis of, 425-436 
in reference distribution, 264—274 
of filters, 276-277 
of frequency standards, 436-440, 446 
of local oscillators, 446 
Phase switching, 30, 290-298, 348, 349 
in Mills cross, 27 
in simple interferometers, 21—23, 29 
interaction with fringe rotation and delay, 
298 
Phase tracking center. See also Phase reference, 
position 
Phase-locked oscillator, 267, 274-276, 436 
loop natural frequency, 274 
Phase-tracking center, 89 
Phased array, 162-164, 187, 365n, 
466-470 
as VLBI element, 466-470 
correlator array, comparison with, 
162-164 
randomly phased, 466 
Photon bunching noise, 49 
Pico de Veleta, Spain, 821 
Planar arrays, 192-193 
Plancherel’s theorem, 84n 
Planck formula, 9, 12, 259-264, 833 
Planck mission, 535, 539 
Planetary nebula, 5, 511 
Planets, 557, 601, 616. See also Burst 
radiation, Jupiter 
as calibration sources, 537—538 
Plasma. See also Interplanetary medium; 
Interstellar medium; Ionosphere 
absorption in, 735 
frequency, 728 
index of refraction, 729-730 
oscillations, 121 
propagation in, 727-758 
RF discharge, 440, 442 
turbulence, 742—744 
Plateau de Bure, France, 37 
Pleiades, 626 
Point-source response, 65, 165, 171, 551. 
See also Beam, synthesized (dirty) 
Point-spread function, 836. See also 
Point-source response 
Pointing correction, 486 
Poisson distribution, 45 
Polar motion, 2—3, 617, 618 
measurement of, 620-621 
Polarimetry, 121—142 
Polarization 
calibration, 137—142 
circular, 122, 123, 128, 129, 140, 141 
complex degree of, 752-753 
cross-polarized, 127—129 
degree of, 122 
design considerations, 141, 142 
ellipse, 123-125 
emission processes, 4, 121 
identically polarized, 126-127 
instrumental, 129-133 
linear, 122, 127-129, 141 
matrix formulation, 134-137 
mismatch tolerance, 289 
parallactic angle effect, 128 
position angle, 122. See also Faraday 
rotation 
Polyphase filter banks, 369-373 
Position measurements 
early, 24 
methods. See Astrometry 
Power (density) spectrum, 63, 98 
atmospheric phase, 686-690 
correlator output, 227 
interplanetary scintillation, 748-750 
phase and frequency fluctuations, 
425—434 
Power combiner, 162, 164 
Power flux density, 6 
Power reception pattern. See Antenna(s), 
reception pattern 
Power-law antenna spacing, 180, 182 
Power-law turbulence relations, 690 
Poynting vector, 6 
Precession, 2, 10, 616—617 
Price’s theorem, 322, 327 
Principal response, 494, 591 
Principal solution. See Principal response 
Probability 
of error, 412—415 
of misidentification, 415 
Probability density function, 741 
Probability distribution 
bivariate Gaussian, 311-312, 644, 741 
Gaussian, 156, 224, 311, 332, 407, 411, 
434 
Rayleigh, 234, 407, 410, 414 
Rice, 234, 408 
Projection-slice theorem, 74-75 
Prolate spheroidal wave functions, 501 
Propagation 
constant, 405, 659 
interplanetary, 744-750 
interstellar, 750-758 
ionospheric, 725-737 
neutral atmosphere, 658-705 
Proper motion, 10, 617, 623—627 
Pulsar, 465 
astrometry, 602 
correlator gating, 365 
determination of vernal equinox, 601 
dispersion measure, 745, 751 
proper motions, 753 
scintillation, 756 
spatial coherence, 784 
timing accuracy, 439, 448 
Pulse calibration (VLBI), 447 


Q-factor of 
cavity, 444-446 
filter, 276 
Quadrature 
network, 215, 348, 355 
phase shift, 222, 229, 298, 352 
Quadruple moment theorem. See Fourth-order 
moment relation 
Quadrupod, 157, 822 
Quantization 
comparison of schemes, 346 
correction, 365, 377—378 
efficiency factor, 228, 328-336, 453, 461 
four-level, 320-326, 377-378 
in VLBI systems, 453, 460 
indecision regions, 350-351 
noise, 309 
repeated (requantization), 368, 470 
three-level, 332-336 
thresholds, 319-320, 327 
two-level, 316-320 
Quantum noise, 44, 833 
Quantum paradox, 44 
Quasar, 4, 38, 41, 393, 567. See also Radio 
source 


Rademacher functions, 291n 
Radial smearing. See Bandwidth, effect in 
maps 
Radiative transfer equation, 671 
Radio interference 
adaptive cancellation, 790 
adaptive nulling, 793 
airborne and space transmitters, 
804-805 
decorrelation, 800—801 
deterministic nulling, 791, 792 
fringe-frequency averaging, 796-800 
threshold pfd and spfd 
short and intermediate baselines, 
796-801 
total power systems, 794 
VLBI, 801-803 
Radio lines. See Spectral line(s) 
Radio link, 24, 26, 261 
Radio source 
0134+329, 11 
0748+240, 691 
1548+115, 567 
1622+633, 190 
1638+398, 612 
1641+399, 612 
3C138, 140 
3C147, 489 
3C224.1, 554 
3C273, 39, 601, 628, 630, 631, 706, 748 
3C279, 628-631, 784 
3C286, 140, 489 
3C295, 489 
3C33.1, 32 
3C48, 4, 5, 10, 489 
Algol, 601 
Cassiopeia A, 4, 5, 19, 23-25, 34 
Centaurus A. See NGC5128 
Crab Nebula, 24, 745 
Cygnus A, 4,5, 10, 11, 19, 20, 24-26, 34, 
34, 37, 730 
central component (VLBI), 36 
fringe pattern, 19, 21 
map or image, 26, 34-36, 568 
IM Peg, 625 
J1745-283, 627 
Jupiter, 37 
M82, 4,5 
M87. See NGC4486 
MG J0751+2716, 41 
MWC349A, 5, 6 
NGC4258, 41, 542 
NGC4486, 24, 559, 593 
NGC5128, 24 
NGC6334B, 755 
NGC7027, 5, 10, 489 
Orion Nebula, 6, 8, 625, 626 
Orion Nebula Cluster (ONC), 626 
Orion water-line maser, 821 
Pleiades, 626 
PSR 1237+25, 784 
Sagittarius A*, 40, 626-627, 755, 757 
Sun, 20, 28, 30, 187, 543 
Taurus A. See Crab Nebula 
TW Hydrae, 4, 5 
Venus, 5, 6, 310, 827 
Virgo A. See NGC4486 
W3 (OH), 417 
W49, 631 
Radio source nomenclature, 10 
Radio spectrum, regulation of, 805 
Radiometer equation, 46—49 
Radiosonde data, 674, 693, 703 
Raised cosine weighting. See Hann weighting 
(smoothing) 
Rayleigh distribution, 234, 407 
Rayleigh—Jeans approximation, 9, 12, 257, 
258, 260. See also Planck formula 
Rayleigh scattering, 824 
Rayleigh theorem, 84n 
Receiver 
electronics, 255—264 
phase switching, 21-23, 27, 29, 290-298. 
See also Phase switching 
temperature, 12, 46, 257 
for cascaded components, 258 
Reception pattern. See Antenna(s), reception 
pattern; Voltage reception (response) 
pattern 
Recording systems (VLBI), 448—450 
Rectangular function, 78 
Redundancy measure, 174 
Reference frames. See ICRF; ICRS 
Reflections 
in cable, 267, 270, 279 
in optical fiber, 263 
Reflector, modulated, 265 
Refraction 
anomalous, 692 
in neutral atmosphere, 658, 663—669 
in plane-parallel atmosphere, 663 
index of, 659 
interplanetary, 744-748 
ionospheric, 728-733 
optical, 657 
origin of, 674-679 
spherically symmetric, 666, 
744-748 
Refractivity. See also Refraction, index of 
optical, 680 
Relative sensitivity of systems, 235-239 
Relativistic effect 
general relativistic bending, 628, 748 
Lorentz factor, 542 
time transfer effects, 447 
Resolution 
atmospheric limitation of, 680-692 
MEM, 559 
Restoration from samples. See Sampling 
theorem 
Retarded baseline, 405—407 
Reuleaux triangle, 183-187 
Reynolds number, 686 
Rice distribution, 234, 408, 566 
Right ascension, 10 
measurement of, 601—606, 645 
zero of, 601 
Ringlobe, 176-178 
RMS bandwidth, 463, 608, 641 
Robust weighting, 494 
Rotation measure, 751, 753 
Round-trip phase, 264-271, 471, 487 
Ruze formula, 156 


Sampling, 312-351. See also Quantization 
digital, accuracy of, 347-351 
of bandpass spectrum, 312 
Sampling theorem, 157-159, 175, 177, 312,
573, 820
Satellite 
data link, 38, 471-472 
interference from, 801, 804 
signals, Faraday rotation, 734 
time transfer, 447 
tracking, 475, 824 
Scalloping, 361 
Scattering, 750, 753-758, 783. See also 
Scintillation 
Schwarzschild radius, 542 
Scintillation 
angular spectrum of, 739, 743 
correlation bandwidth, 740 
correlation length, 740 
critical source size, 740 
Gaussian screen model, 737-742
index, 749
interplanetary, 748-750
interstellar, 188, 753-758
ionospheric, 736, 737
neutral atmosphere, 685-690
power-law model, 742-744
scattering angle, 739, 743
thin screen, 737-742
Sea interferometer, 20
Second, definition of, 441, 600. See also Time
Seeing, 657. See also Scintillation 
cell, 828 
disk, 836 
Self-absorption, 4 
Self-calibration, 565-569 
Self-coherence, 778 
Sequency, 292 
Serial-to-parallel conversion, 352 
Serpukhov, Russian Federation, 28 
Sgn function, 79, 105 
Shah function, 157, 496-497 
Shift theorem, 80, 526 
Shift-and-add algorithm, 836 
Short-spacing data, 177, 576-577 
Shot noise, photon, 441, 445, 834 
Sideband(s), 208 
double, 215-223, 238-239 
fringe-frequency dependence, 115-117 
partial rejection of, 251-253 
relative advantages of single, double, 
221 
separation, 221-223 
sideband-separating (image-rejection) 
mixer, 301 
single (upper, lower), 209, 210, 212 
unequal responses, 251-253 
Sidelobe. See also Ringlobe; Beam,
synthesized (dirty) 
bandwidth smearing of, 243 
envelope model, 794 
Sidereal rate (Earth rotation), 119 
Signal search (VLBI), 412-419 
Signal transmission subsystems, 261-264
Signal-to-noise ratio, 231. See also Noise
aliasing effect, 501-503
basic analysis, 223-240
coherent averaging, 415-419
of frequency standard, 436-440
frequency response, effect of, 277-282
fringe-frequency mapping, 634
in images, 230-233
in interference calculations, 793-803
in lunar occultations, 818
in phased arrays, 466-470
incoherent averaging, 415-419 
intensity interferometer, 419, 814 
loss factors, VLBI, 454-461 
optical, 833-834 
quantization, effect of, 316-347 
quantum effect, 834 
receiving system, 12 
systems, relative, 235-239 
Signals 
cosmic, 4-10 
ergodic, 4, 104 
spurious, 290, 298. See also Errors 
Sinc function, 61 
definition, 61 
Single sideband mixer. See Sideband(s), 
sideband-separating mixer 
Site testing 
opacity, 693-696 
phase stability, 697-706 
SMA (Submillimeter Array). See
Submillimeter Array
Smearing, circumferential. See Visibility, 
averaging 
radial. See Bandwidth, effect in maps 
Smith-Weintraub equation, 680
Smithsonian Astrophysical Observatory 
(SAO), 14, 187 
Smoothing functions, 355
SMOS (Soil Moisture and Ocean Salinity), 825
Snapshot, 182, 185
Snell’s law, 663, 745
spherical coordinates, 666, 745
Soil temperature, 264 
Solar imaging, 28-30, 543 
Solar system studies, 28-30, 775 
Solar wind, 744-748
Source. See Radio source 
calibration, 395, 485-489 
coherence, 777-780
completely coherent, 780 
extended. See Extended (broad) sources 
far-field condition, 59, 90, 775 
incoherence requirement, 90, 769-771, 774 
model. See Model 
radio. See Radio source 
subtraction, 534. See also CLEAN 
algorithm 
Source catalog 
3C, 10, 27 
ICRF, 11 
Messier, 10 
NED, 5, 11 
NGC, 10 
NVSS, 11 
South Pole, 693, 697 
Space debris, tracking, 824 
Space Interferometry Mission (NASA), 
831 
Space VLBI. See OVLBI 
Spatial frequency, 66, 70, 164-166 
coverage, 165-166 
filter, 165 
Spatial incoherence. See Source, incoherence 
requirement 
Spatial sensitivity 
of aperture antenna, 576-577 
of correlator array, 164-167 
support of, 165 
Spatial transfer function. See Transfer function 
Specific intensity. See Intensity 
Speckle imaging, 835-838 
phase information, 838 
shift-and-add, 836 
Spectral 
flux density, 6 
power flux density, 6 
Spectral line(s) 
absorption, 31 
accuracy, 528 
adaptive calibration, 585 
analog correlator, 264 
atmospheric absorption, 670-673, 694 
bandpass calibration, 523-524 
bandpass ripple, 523-524 
baseline ripple, 357 
calibration procedure, 523-530 
chromatic aberration, 527 
CLEAN procedures, 585 
correlators, 353-375 
digital correlators, 353-375 
Doppler shifts, 538-543
reference frames, 541-543
double-sideband observation, 221 
examples of 
CO, 10, 693 
H2, 701-702 
H2O, 6, 38, 417, 631, 635
hydrogen, 6, 31, 442 
OH, 6, 391, 631 
SiO, 6 
presentation, 528-530 
radiation. See Spectral line(s); Maser radio 
sources 
systems, 31, 353-375 
table of important, 6 
velocity reference frames, 541 
VLBI procedures, 41-42, 405, 524-527, 
631-636 
Spheroidal wave functions, 500-501 
Splatalogue, 6 
Stanford, California, 29 
Stars 
observation of, 14-18, 600, 814, 819, 831,
835 
proper motion, 617 
Spatially coherent source, 780
Step-recovery diode, 447 
Stokes parameters, 121-123
Stokes visibilities, 126-129 
Strehl ratio, 156 
Structure function 
phase (spatial), 681, 686-690, 743-744 
phase (temporal), 435, 689 
refractive index (spatial), 686 
Submillimeter Array (SMA), 14, 187, 470, 822 
Sun 
coronal refraction, 744-748
gravitational effects, 616 
interference from, 533 
ionosphere, 726 
observation of, 20, 28-30, 187, 543 
relativistic deflection, 748 
solar time, 619 
solar wind, 744-748 
Sunyaev-Zeldovich effect, 522
Superluminal motion, 38 
Support of a function, 165 
Survey interferometers, 26-28 
Swarup and Yang system, 265-266 
Symmetry, n-fold, 180 
Synchronous detector, 266, 290, 673 
Synchrotron radiation, 4, 121, 393, 752 
Synthesis imaging 
evolution of techniques, 13, 31, 34 
Synthesized beam. See Beam, synthesized
(dirty) 
System equivalent flux density (SEFD), 12, 
489, 526 
System temperature, 12, 225-228, 239, 486 
correction for atmospheric absorption, 
672-673 
measurement of, 300 


Tangent plane, 93, 96, 98, 581 
Taper. See Gaussian taper; Weighting 
Target source, 487 
TDRSS experiment, 14, 42 
Tectonic plates, 3, 42, 622 
Telephone signal transmission, 395 
Temperature 
antenna, 12 
receiver, 12, 257-260
system. See System temperature 
Temperature coefficient of length, 264 
Thomson scattering (incoherent backscatter), 
734, 745 
Time 
averaging of visibility, 246-249 
definition of second, 441 
demultiplexing, 367 
International Atomic (IAT), 441, 619 
multiplexing, 274 
solar, 619 
time synchronization, 447 
transfer methods, 447 
universal time, 447, 619-621 
Timing accuracy, 117, 448 
Tipping-scan method, 672 
Tolerances in 
bandpass (frequency response), 281-282 
delay-setting, 283 
polarization, 290 
three-level sampling, 348-351 
Total electron content. See Ionosphere, total 
electron content 
Transfer function, 164-166, 169-172. See also 
Spatial sensitivity 
OVLBI, 188-191 
VLA, 185 
VLBA, 189 
Transmission lines. See Cables; Local 
oscillator, synchronization of; 
Optical fiber; Waveguide 
Traveling ionospheric disturbances (TIDs), 736 
Tripod, 157 
Troposphere. See Atmosphere, neutral 
Truncated function, 107 
T-shaped array, 28, 168, 180
Turbulence
Allan variance of, 445
in neutral atmosphere, 685-690
inner and outer scales of, 686-690
Kolmogorov, 685-690
power-law relations, 690
spectrum of phase fluctuations, 689
structure function of phase, 686
Two-dimensional array, 73-74, 179-193 
Two-dimensional synthesis, 73 


(u, v) plane (spatial frequency plane), 73, 92 
coordinates, 73, 92 
coverage. See Spatial frequency, coverage 
holes in coverage, 173 
in CLEAN algorithm, 556 
in interference susceptibility, 797-800
interpolation in, 159, 161, 497-501 
(u’, v’) plane, 96-98, 113-114
(u, v, w) component, 91-93, 580-585 
in fringe-frequency averaging, 797 
in visibility (time) averaging, 246-249 
Uncertainty principle, 44, 81, 440, 834 
Undersampling, 313, 316 
Uniform weighting, 492-495 
Unit rectangle function, 78, 279-280, 498, 
517 
Universal time, 619-621
Usuda, Japan, 188, 475 
UTR-2, 186 


Van Cittert-Zernike theorem, 1, 94, 767-775
assumptions, 774
derivation, 767-771
Van Vleck relationship, 318
Van Vleck-Weisskopf profile, 671, 679
Varactor diode, 275, 445 
Variance matrix, 643 
Velocity standard. See Spectral line(s) 
Vermilion River Observatory, Illinois, 38 
Very Large Array (VLA), 11, 36, 261, 282, 
588, 784 
antenna configuration, 180-182 
atmospheric phase noise, 613, 691 
delay increments, 283 
dynamic range, 570 
images from, 35, 567, 568 
interference thresholds, 795, 799 
opacity at site, 693 
phase switching, 299 
phased-array mode (VLBI), 466 
self-calibration, 567
(u, v) spacing loci, 183 
Very Long Baseline Array (VLBA), 40, 188, 
453 
phase referencing, 612 
Very-long-baseline interferometry. See VLBI 
Visibility 
at low spatial frequencies, 571-577 
averaging, 246-249 
complex, 30, 70, 91 
fringe (Michelson), 16 
lensclean, 590 
model fitting, 510-522 
reduction due to phase noise, 277, 680-685, 
741-742 
Visibility frequencies, 115-117 
Visibility-intensity relationship, 89-91,
767-775
VLA. See Very Large Array 
VLBA. See Very Long Baseline Array 
VLBI 
antenna polarization (parallactic angle),
141 
antennas 
in space, 188-191, 470-476
nonidentical, 120-121
arrays, 37-42
astrometry, 606-616, 645
atmospheric limitations, 609, 685
bandwidth synthesis, 462-464
burst mode, 465
calibration sources, 395, 489. See also
Phase referencing
clock errors, 399-405
closure phase, 421-422, 563-569
coherence time, 392, 434-436 
coherent and incoherent averaging, 
415-419 
data encoding, 448-450 
data storage systems, 448-450
development of, 37-42, 391-394, 748
discrete delay step loss, 459-461 
double-sideband system, 223, 236 
fractional bit shift loss. See VLBI, discrete 
delay step loss 
frequency standards, precise, 436-440,
446 
fringe detection, 422 
fringe fitting 
global (multielement), 419-425 
two-element, 412-419 
fringe rotation, 454-457, 461 
fringe rotation loss, 454-457 
fringe sideband rejection loss, 457-459 
geodesy, 40
group delay, 403, 462 
hybrid mapping, 563-564 
in geodesy, 622 
interference sensitivity, 801-803 
K-4 system, 453 
local oscillator stability, 446 
Mark I system, 392, 417, 450, 453 
Mark II, III, and IV systems, 450, 453
masers, mapping, 631-636
networks, 38-41 
noise in, 407-412 
orbiting. See OVLBI 
phase calibration system, 447 
phase closure, 40 
phase noise, 233-235. See also VLBI, 
atmospheric limitations 
phase referencing, 610-616 
phase stability, oscillators, 425-436 
phased-array elements, 466-470 
polar motion observations, 620-621 
probability distributions, 407-415 
pulse calibration system, 447 
quantization loss, 392, 461 
RadioAstron, 473, 474, 636 
recording systems, 448-450
relativistic bending measurements, 748
retarded baseline, 405-407
S2 system, 453 
satellite link, 38 
satellite positioning, 473-476 
sideband separation, 223, 236 
signal-to-noise ratio, 392-393, 418, 
454-461 
spectral line, 405, 460, 524-527 
TIDs, observation of, 736 
time synchronization, 447, 448 
triple product, 423-424 
VSOP project, 473, 475 
water-vapor radiometry, 700, 703-705 
VLBI Space Observatory Programme (VSOP), 
14, 43 
Voltage reception (response) pattern, 100-101, 
166, 772 
measurement of, 819 
VSOP. See VLBI Space Observatory 
Programme 


w component, 92-94, 115, 580-585, 800 
Walsh functions, 292-295 
natural order, 294 
orthogonality, period of, 291 
sequency, 292 
Water vapor
22-GHz line, 677
absorption, 670-673
effect on phase, 680-685
maser, 43, 417, 631-636
resonance model, 675-679
turbulence, 685-690
worldwide distribution, 661
Water-vapor radiometry, 703-705 
Water-vapor refractivity, 657-663, 680 
Waveguide, 258, 261 
Weighting
antenna excitation, 169, 576-577
function
atmospheric, 683
spectral, 355-357, 524
natural, 491, 494
of visibility, 231-232, 491-495
Westerbork Synthesis Radio Telescope, 35,
128, 175, 176, 466, 499
Westford, Massachusetts. See Haystack 
Observatory 
White light fringe, 64, 395, 831
WIDAR, 359
Wide-field imaging, 95-98, 245, 487, 503-504,
570-577, 581-585
Wiener-Khinchin relation, 63, 87, 98-99, 225,
312, 354, 800
Wind spacecraft, 745
WMAP mission, 535


X-ray interferometry, 832 


Young’s two-slit interferometer, 44 
Y-shaped array, 180-182 


Zeeman effect, 121, 140, 442, 446
Zenith opacity, 670, 693-696
Zero padding, 360, 384-385
Zero spacing problem. See Short-spacing data


